ABSTRACT

This  long  chapter  reviews  research  on  problem  solving  and  reasoning; 
the  intended  use  is  as  text  material  for  advanced  students  and  others 
needing  a  moderately  detailed  introduction  to  the  topics.  The 
orientation  is  primarily  psychological,  with  significant  attention 
given  to  results  from  artificial  intelligence.  Major  theoretical  concepts 
such  as  problem  representation,  the  problem  space,  strategic  knowledge, 
and  problem-solving  search,  are  developed  in  detail;  and  major  empirical 
methods such as thinking-aloud protocols, problem-behavior graphs, and
use  of  error  patterns  and  latencies,  are  described  and  illustrated. 
Sections  of  the  chapter  include:  Problems  with  well  specified  goals 
and  procedures,  Problems  of  design  and  arrangement,  Inductive  problem 
solving,  and  Evaluation  of  deductive  arguments. 


This  is  a  draft  of  a  chapter  to  appear  in  R.  C.  Atkinson,  R.  Herrnstein, 
G.  Lindzey,  and  R.  D.  Luce  (Eds.),  Stevens'  Handbook  of  Experimental 
Psychology.  (Revised  Edition).  New  York:  John  Wiley  &  Sons. 


PROBLEM  SOLVING  AND  REASONING 
James  G.  Greeno  and  Herbert  A.  Simon 

I.  Introduction 

Important  advances  have  been  achieved  in  the  1960's  and  1970's  in  the 
scientific  study  of  thinking.  These  advances  have  resulted  from  new 
methods  for  formulating  models  of  the  cognitive  processes  and  structures 
underlying  performance  in  complex  tasks,  and  the  development  of 
experimental  methods  to  test  such  models.  A  major  accomplishment  has  been 
the  discovery  of  general  forms  of  cognitive  activity  and  knowledge  that 
underlie human problem solving and reasoning. This chapter surveys the
major theoretical concepts and principles that have been
developed,  presents  some  of  the  evidence  that  supports  these  principles, 
and  discusses  the  empirical  and  theoretical  methods  that  are  used  in  this 
domain  of  scientific  study.  In  this  introductory  section,  we  give  an 
overview  of  the  major  concepts  that  will  be  described  in  detail  in  the 
chapter,  and  we  discuss  relations  between  these  concepts  and  issues  that 
have  been  investigated  previously  in  experimental  psychology.  We  also 
discuss  some  general  methodological  issues. 

I.A. Overview of Concepts

The  concepts  that  have  been  developed  can  be  discussed  conveniently  in 
two  groups:  hypotheses  about  the  form  of  cognitive  action,  and  hypotheses 
about  the  form  of  cognitive  representation.  The  hypotheses  about  cognitive 
action  extend  analyses  of  behavior  that  were  developed  in  general  behavior 
theory by investigators such as Thorndike (e.g., 1923), Tolman (e.g.,
1928), Skinner (e.g., 1938), and Hull (e.g., 1943). The hypotheses about
representation  extend  analyses  that  were  developed  by  Gestalt  psychologists 
such  as  Kohler  (1929),  Duncker  (1935/1945),  Katona  (1940),  and  Wertheimer 
(1945/1959). One of the important insights reached in the analysis of
problem  solving  is  that  hypotheses  about  these  issues  of  action  and 
representation  are  complementary,  and  both  are  necessary  components  of  a 
theory  of  human  thought.  We  will  discuss  the  two  groups  of  concepts  in 
turn  in  this  overview;  however,  in  the  sections  that  follow,  hypotheses 
about  action  and  representation  will  be  integrally  related  in  models  of 
performance  in  specific  tasks. 

I.A.1. Form of Cognitive Action. Hypotheses about cognitive action
can be considered at two levels: basic action knowledge and strategic
knowledge.

A  consensus  has  developed  that  human  knowledge  underlying  cognitive 
action can be represented in the form of production rules, a formalism
introduced  by  Post  (1943)  to  represent  reasoning  in  mathematics,  and 
adapted  for  application  to  psychology  by  Newell  and  Simon  (1972).  Models 
in  which  knowledge  for  action  is  represented  as  a  set  of  production  rules 
are  referred  to  as  production  systems. 

Any  theory  of  performance  must  include  hypotheses  about  the  process  of 
choice  whereby  individuals  select  the  actions  that  they  perform.  A 
production  system  provides  a  framework  for  expressing  hypotheses  about  this 
process  in  specific  detail.  A  production  rule  (or,  more  simply,  a 
production)  consists  of  a  condition  and  an  action.  The  condition  specifies 
a pattern of information that may or may not be present in the situation.
The  action  specifies  something  that  can  be  performed.  The  general  form  of 
action  based  on  productions  is  simply:  if  the  condition  is  true,  perform 
the  action. 

In  a  production  system,  the  basic  problem  of  choice  among  actions  is 
solved  by  specifying  conditions  that  lead  to  the  selection  of  each  action 
that  can  be  performed.  The  condition  of  each  production  rule  is  a  pattern 
of  information  that  the  system  can  recognize.  These  patterns  include 
features  of  the  external  problem  situation  (the  stimulus).  They  also 
include  information  that  is  generated  internally  by  the  problem  solver  and 
held  in  short-term  memory.  The  internal  information  includes  goals  that 
are set during problem solving. It also can include information in memory,
such  as  past  attempts  to  achieve  specific  goals.  Thus,  production  rules, 
which  represent  basic  action  knowledge,  consist  of  associations  between 
patterns  of  information  and  actions.  An  action  is  chosen  when  the 
individual  has  a  goal  with  which  the  action  is  associated  and  the  external 
stimulus  situation  as  well  as  information  in  memory  include  features 
associated  with  the  action. 
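
The selection process just described can be sketched in a few lines of
Python. This is a minimal illustration under assumptions of our own, not a
model taken from the studies reviewed here: the Situation and Production
classes, the two example rules, and the conflict-resolution policy (take the
first rule whose condition matches) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Situation:
    goal: str             # goal currently held by the problem solver
    stimulus: frozenset   # features of the external problem situation
    memory: frozenset     # internally held information, e.g. past attempts


@dataclass(frozen=True)
class Production:
    name: str
    condition: Callable[[Situation], bool]  # pattern the system can recognize
    action: str                             # what to do when the pattern holds


def select_action(rules, situation) -> Optional[str]:
    """Return the action of the first production whose condition is true.

    Taking the first match is an assumed conflict-resolution policy.
    """
    for rule in rules:
        if rule.condition(situation):
            return rule.action
    return None


# Hypothetical rules: each condition combines a goal with stimulus or memory.
RULES = [
    Production(
        name="try-something-else",
        condition=lambda s: s.goal == "open-door" and "tried-pushing" in s.memory,
        action="pull-handle",
    ),
    Production(
        name="push",
        condition=lambda s: s.goal == "open-door" and "door-closed" in s.stimulus,
        action="push-door",
    ),
]

first_try = Situation("open-door", frozenset({"door-closed"}), frozenset())
second_try = Situation("open-door", frozenset({"door-closed"}),
                       frozenset({"tried-pushing"}))

print(select_action(RULES, first_try))   # push-door
print(select_action(RULES, second_try))  # pull-handle
```

The goal and the external stimulus are the same in both situations; only the
contents of memory differ, and that difference alone changes the action that
is selected.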

An  important  component  of  a  model  of  cognitive  activity  is  its 
representation of strategic knowledge. This includes processes for setting
goals  and  adopting  general  plans  or  methods  in  working  on  a  problem. 
Analyses  of  general  problem-solving  strategies  have  been  developed  to 
simulate  performance  in  novel  problem  situations  where  the  individual  has 
little  or  no  experience.  One  major  analysis  is  based  on  a  process  of 
means-ends  analysis  (e.g.,  Newell  &  Simon,  1972)  in  which  goals  are 
compared  with  current  states,  and  actions  are  selected  to  reduce 
differences  that  are  identified.  General  strategies  also  include  processes 
for  setting  subgoals  when  the  current  goal  cannot  be  achieved  directly. 
Analyses  of  strategic  knowledge  in  specific  domains  also  have  been 
developed  to  simulate  performance  by  problem  solvers  who  have  received 
special  training  (e.g.,  Greeno,  1978).  Strategic  knowledge  of  experienced 
problem  solvers  includes  global  plans  for  solving  classes  of  problems  and 
knowledge  of  subgoals  that  are  useful  in  classes  of  problem  situations. 

The  general  ideas  used  in  formulating  hypotheses  about  cognitive 
activity  in  production  systems  can  be  regarded  as  building  upon,  rather 
than  negating,  the  concepts  developed  and  used  in  general  behavior  theory, 
particularly  the  formulations  of  Tolman,  and  the  later  forms  of  Hull's 
theory.  Early  expositions  of  behavior  theory  emphasized  the  direct 
relations  between  stimuli  and  responses,  with  rather  deliberate  inattention 
to  intervening  events  in  the  brain.  Thorndike  (1923)  emphasized  that 
actions are chosen because of their associations or bonds with stimulus
conditions.  In  Skinner's  (1938)  formulation,  actions  are  performed  under 
the  "control"  of  external  stimulus  features.  Tolman  (1928),  on  the  other 
hand, strongly recognized the need to include internal goals and
information  stored  in  memory  in  the  determination  of  response  selection. 
Tolman  used  such  terms  as  "means-end  expectation"  and  "means-end  readiness" 
in  referring  to  these  factors.  In  Hull's  theory,  concepts  of  covert 
anticipatory  responses  (1930)  and  incentive  motivation  (1952)  were  used. 

In  discussions  of  problem  solving,  Maltzman  (1955)  and  Staats  (1966) 
postulated  stimulus-response  units  at  differing  levels  of  generality,  and 
the  idea  of  knowledge  about  action  at  different  levels  is  used  in  more 
recent  formulations  of  strategic  knowledge,  especially  in  hypotheses  about 
planning.

The  concept  of  a  production  rule  is  consistent  with  these 
formulations;  and  behavior  theory,  even  in  the  terms  used  by  Watson  and 
Skinner,  can  be  expressed  as  a  system  of  productions  (Millenson,  1967). 
However,  as  production  rules  are  used  in  contemporary  information 
processing  theory,  they  make  much  more  explicit  than  did  earlier  theories 
the  motivational  states  and  memories  of  prior  experiences  that  combine  with 
external  stimulus  conditions  to  determine  choice  of  a  response.  Modern 
production  system  models  of  problem  solving  and  similar  cognitive  processes 
may  be  viewed  as  a  (lengthy)  extrapolation  of  Tolman's  research  program 
that  symmetrizes  the  roles  of  external  environment  (stimulus)  and  inner 
environment  (motivational  states  and  memory  contents)  as  determinants  of 
response;  and  that  makes  far  more  explicit  than  earlier  formulations  were 
able  to  exactly  how  those  two  sources  of  information  control  responses.  We 
characterize  the  extrapolation  as  "lengthy"  because  not  only  does  it 
postulate  that  many  of  the  essential  components  of  the  stimulus  lie  in  the 
brain,  but  also  that  a  large  part  of  the  response  to  a  production  (or  all 
of  it)  may  be  internal  —  consisting,  for  example,  of  a  change  in  content 
of short-term or long-term memory. We do not want to underestimate the
magnitude  of  the  shift  in  viewpoint,  but  we  do  wish  to  emphasize  that  it  is 
a  continuous  development  from  the  experimental  psychology  that  preceded  it, 
and  not  a  new  start.  That  is  presumably  what  Miller,  Galanter,  and  Pribram 
(1960)  also  meant  when  they  described  the  new  approach  (half  jokingly)  as 
"subjective  behaviorism."  ("Subjective,"  of  course,  referred  to  the  minds 
of  the  subjects,  not  to  the  scientific  methods  of  the  investigators.) 

One  major  difference  between  recent  hypotheses  about  cognitive 
activity  and  those  developed  in  general  behavior  theory,  in  addition  to  the 
shift  to  internal  events  in  behavior,  is  that  recent  formulations  are  much 
more  definite  and  specific.  Models  have  been  formulated  as  production 
systems  with  sufficient  specificity  that  they  can  be  expressed  as  computer 
programs  that  simulate  actual  performance  of  solving  specific  problems.  To 
do  this,  it  is  not  sufficient  to  postulate  the  existence  of 
stimulus-response  associations  and  goals,  even  at  differing  levels  of 
generality;  it  is  necessary  also  to  formulate  hypotheses  about  just  what 
the  stimuli,  responses,  and  goals  are.  Hypotheses  about  specific 
structures  of  knowledge  about  actions  and  goals  in  the  problem  domain  have 
to  be  constructed,  and  processes  have  to  be  designed  to  recognize  specific 
patterns  of  information  in  the  task  situation  that  are  relevant  to  solving 
problems.  Hypotheses  about  strategic  knowledge  have  to  specify  the 
conditions  in  which  goals  will  be  set  and  plans  will  be  adopted. 

Again,  we  prefer  to  emphasize  continuity,  rather  than  discontinuity  in 
this  development.  Nothing  in  the  new  fine-grained  mechanisms  is 
antithetical  to  the  grosser  level  of  description  of  the  earlier  theories. 

In  fact,  important  progress  has  been  made  in  explaining  in  detail  (and 
sometimes  quantitatively)  the  rich  body  of  experimental  data  provided 
within  the  behavioral  scheme  (e.g.,  Simon  &  Feigenbaum,  1964;  Gregg  & 
Simon, 1967). But achieving this higher level of resolution in our
theoretical models and their predictions has led to significantly
greater understanding of the psychological processes involved
in problem solving and reasoning.

I.A.2. Hypotheses about Representation. Hypotheses about cognitive
representations  of  problems  are  formulated  using  the  idea  of  a  problem 
space.  The  problem  space  includes  an  individual's  representation  of  the 
objects  in  the  problem  situation,  the  goal  of  the  problem,  and  the  actions 
that  can  be  performed  and  strategies  that  can  be  used  in  working  on  the 
problem.  It  also  includes  knowledge  of  constraints  in  the  problem 
situation:  restrictions  on  what  can  be  done,  as  well  as  limits  on  the  ways 
in  which  objects  or  features  of  objects  can  be  combined. 

In  developing  hypotheses  about  representation  of  problems,  much  use 
has  been  made  of  concepts  developed  in  analyses  of  language  understanding, 
including  networks  of  propositions  (Quillian,  1968;  Kintsch,  1974; 
Anderson,  1976),  procedural  representation  of  concepts  (Feigenbaum,  1963; 
Hunt, Marin & Stone, 1966; Winograd, 1972), and schemata (Schank, 1972;
Hayes & Simon, 1974; Norman & Rumelhart, 1975; Schank & Abelson, 1978).
Representations  of  problems  differ  from  those  usually  postulated  for 
understanding  of  language  because  they  are  constrained  to  provide 
information  needed  for  solving  the  problem.  Hypotheses  about  knowledge 
used  in  representing  problems  include  processes  for  recognizing  features 
that  are  relevant  to  actions,  strategies,  and  constraints  of  the  problem 
domain,  and  for  constructing  representations  with  information  that  can  be 
used  in  the  cognitive  processes  of  problem  solving. 

Hypotheses  about  problem  representations  have  begun  to  address  some  of 
the issues of understanding principles and structure in problem solving
that  were  emphasized  by  some  educational,  developmental,  and  Gestalt 
psychologists  (e.g.,  Judd,  1908;  Kohler,  1929;  Brownell,  1935;  Duncker, 
1935/1945;  Katona,  1940;  Piaget,  1941/1952;  Wertheimer,  1945/1959).  As 
with  hypotheses  about  cognitive  activity,  current  hypotheses  about 
representation  are  more  definite  and  specific  than  those  that  characterized 
previous  discussions.  The  hypotheses  specify  cognitive  processes  and 
structures  that  actually  construct  representations  from  the  texts  or  other 
presentations  of  problem  information  (e.g.,  Hayes  &  Simon,  1974;  Larkin, 
McDermott, Simon & Simon, 1978; Riley, Greeno & Heller, 1983). Hypotheses
about  understanding  of  problem  structure  and  general  principles  include 
cognitive structures that specify just what is understood about the problem,
and how the understanding is achieved (Greeno, 1983; Greeno, Riley &
Gelman, in press). Another characteristic of recent discussions is that
hypotheses  about  understanding  are  coordinated  with  hypotheses  about 
cognitive  activity  in  problem  solving,  so  the  significance  of 
understanding,  as  well  as  the  specific  information  that  it  provides  for  the 
problem  solver,  is  made  clear. 

I.B.  Methodology 

The  use  of  computer  programming  languages  as  formal  systems  for 
psychological  theory  has  been  a  major  factor  in  the  development  of  the 
concepts  and  empirical  results  that  we  describe  in  this  chapter.  The 
standards  that  are  now  common  for  adequacy  of  a  hypothesis  include  its 
expression  in  a  computer  program  that  simulates  actual  solution  of  problems 
—  that  is,  a  description  of  the  problem  can  be  given  as  input  for  the 
program,  and  the  program  carries  out  steps  that  result  in  the  problem's 
being  solved.  To  meet  this  standard,  the  theorist  must  develop  specific 
hypotheses  about  many  aspects  of  the  psychological  process  that  were 
previously  left  unspecified.  Representations  of  specific  stimulus 
situations must be postulated, including relations among cues that are
assumed  to  provide  important  information  for  the  subject.  Knowledge 
structures  and  processes  required  for  comprehension  of  stimulus  situations 
must  be  specified,  leading  to  specific  forms  of  information  that  are 
assumed  to  constitute  the  subject's  cognitive  representations  of  the 
stimuli.  Assumptions  about  knowledge  in  the  subject's  memory  are  specified 
in  detail,  including  associative  structures  of  information  and  production 
rules  in  which  specific  actions  are  associated  with  specific  stimulus 
conditions.  The  actions  include  overt  responses  as  well  as  internal 
actions  such  as  setting  goals  and  choosing  plans. 

To  provide  evidence  for  these  more  detailed  hypotheses,  more  detailed 
data  are  required.  A  major  source  of  these  data  has  been  the  increased  use 
of  thinking-aloud  protocols.  Protocols  provide  a  more  detailed  description 
of  behavior,  enabling  inferences  about  intermediate  steps  such  as  subgoals 
and  attention  to  specific  aspects  of  the  problem.  Protocol  statements  are 
not  treated  as  introspective  descriptions  of  psychological  processes,  but 
rather  as  overt  reports  of  mental  activity  that  the  subject  would  be  aware 
of  in  any  case,  but  usually  would  not  announce.  Indeed,  subjects  are 
instructed  to  avoid  trying  to  explain  their  behavior,  but  only  to  give 
reports  of  things  they  notice  or  think  about  as  they  are  working  (cf. 
Ericsson  &  Simon,  1980).  Statements  in  protocols  provide  data  to  be 
explained  by  models  that  constitute  hypotheses  about  the  process,  and  thus 
protocol  statements  have  the  same  status  as  other  detailed  observations, 
such  as  specific  patterns  of  errors  by  individuals  on  sets  of  problems, 
latencies  of  response  when  information  for  problems  is  presented 
sequentially,  or  eye  fixations  during  processing  of  problem  information. 

I.C.  Chapter  Contents 

The  remainder  of  this  chapter  is  organized  in  five  sections.  We  have 
organized  the  findings  and  conclusions  that  we  present  on  the  basis  of 
general  properties  of  the  cognitive  tasks  that  have  been  studied. 

Section  II  deals  with  problems  in  which  a  definite  goal  or  solution 
procedure  is  specified.  Analyses  of  problems  of  this  kind  have  been 
especially  important  in  the  development  of  concepts  and  methodology,  and  we 
have  devoted  more  space  to  Section  II  than  we  have  to  the  other  sections. 

In  Section  II  we  develop  general  theoretical  ideas,  such  as  the  problem 
space  and  heuristic  search,  that  are  used  without  detailed  development  in 
later  sections.  Section  II  also  includes  discussion  of  methodology  and 
empirical  evidence  in  more  detail  than  later  sections.  Conclusions 
presented  in  other  sections  are  based  on  evidence  similar  to  that  discussed 
in  Section  II,  although  space  did  not  permit  us  to  describe  the  empirical 
findings  as  fully  in  those  later  discussions. 

Examples  of  problems  specified  by  goals  or  procedures  include  logic 
exercises,  where  the  goal  is  to  derive  a  specified  expression,  and 
arithmetic  problems  in  which  a  child  must  perform  the  steps  of  subtraction. 
These  problems  present  a  situation  and  require  performance  of  a  sequence  of 
actions that transform the situation. A limited set of problem-solving
operators is permitted, restricting the actions that can be performed.

The  subject's  task  can  be  viewed  as  a  search  in  a  space  of  action 
sequences,  where  there  are  many  possible  sequences  of  actions,  only  a  few 
of  which  lead  to  the  problem  goal  and  conform  to  the  constraints  of  the 
situation.
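
The idea of search in a space of action sequences can be made concrete with
a small sketch. The Python code below searches exhaustively, breadth first,
over pouring sequences in a three-jar puzzle. The jar capacities, the goal,
the function names, and the choice of breadth-first search are assumptions
made for illustration; human problem solvers search far more selectively
than this.

```python
from collections import deque


def successors(state, capacities):
    """Yield (move, next_state) for every legal pour from jar i into jar j."""
    for i in range(len(state)):
        for j in range(len(state)):
            if i == j or state[i] == 0 or state[j] == capacities[j]:
                continue  # nothing to pour, or the target jar is full
            amount = min(state[i], capacities[j] - state[j])
            nxt = list(state)
            nxt[i] -= amount
            nxt[j] += amount
            yield (i, j), tuple(nxt)


def shortest_solution(start, capacities, is_goal):
    """Breadth-first search; returns a shortest list of moves, or None."""
    frontier = deque([start])
    came_from = {start: None}  # state -> (previous state, move taken)
    while frontier:
        state = frontier.popleft()
        if is_goal(state):
            path = []
            while came_from[state] is not None:
                state, move = came_from[state]
                path.append(move)
            return path[::-1]
        for move, nxt in successors(state, capacities):
            if nxt not in came_from:  # each state is expanded at most once
                came_from[nxt] = (state, move)
                frontier.append(nxt)
    return None


# Hypothetical instance: jars of 8, 5, and 3 units; divide 8 units into 4 and 4.
CAPACITIES = (8, 5, 3)
START = (8, 0, 0)
plan = shortest_solution(START, CAPACITIES, lambda s: s == (4, 4, 0))
print(len(plan), plan)
```

Each move is a pair (i, j) meaning "pour jar i into jar j". The operators
restrict what can be done at each step, and only a small fraction of the
possible move sequences end in the goal state.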

In  Section  III  we  consider  problems  of  design  and  arrangement,  where 
goals  are  specified  in  terms  of  general  criteria,  rather  than  as  definite 
states  or  procedures.  A  familiar  example  is  an  anagram  problem,  where  a 
set  of  letters  is  to  be  arranged  into  a  word.  Problems  of  design  differ 
from  the  transformation  problems  discussed  in  Section  II  in  that 
constraints are imposed mainly on the solution state, rather than on the
actions that can be used in achieving the state. Thus, design problems can
be understood as problems of search in a space that contains many possible
arrangements  of  the  problem  materials,  only  one  or  a  few  of  which  satisfy 
the  problem  criterion. 
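
A brief sketch may help to show what search in a space of arrangements
amounts to for an anagram. The Python code below generates every ordering of
the letters and keeps those that meet the criterion of being a word. The
five-word lexicon and the function name are hypothetical, and exhaustive
generation is an assumption made for simplicity rather than a claim about
how people solve anagrams.

```python
from itertools import permutations

# Hypothetical lexicon; "water" is included to show that non-matches are ignored.
LEXICON = {"stone", "notes", "onset", "tones", "water"}


def solve_anagram(letters, lexicon):
    """Return, in sorted order, every arrangement of letters that is a word."""
    found = set()
    for arrangement in permutations(letters):  # 120 orderings of 5 distinct letters
        candidate = "".join(arrangement)
        if candidate in lexicon:               # the problem criterion
            found.add(candidate)
    return sorted(found)


print(solve_anagram("etnos", LEXICON))  # ['notes', 'onset', 'stone', 'tones']
```

Note that the constraint applies only to the final arrangement: any ordering
may be generated, but only four of the 120 satisfy the criterion.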

In  Sections  IV  and  V  we  consider  tasks  that  are  often  called  reasoning 
rather than problem solving. Section IV takes up problems of induction,
and Section V deals with deductive syllogisms. Analyses of processes
involved in these tasks reveal that they share basic characteristics of the
processes  involved  in  tasks  ordinarily  considered  as  problem  solving, 
although  they  also  have  some  distinctive  features.  In  induction  problems, 
the goal is to identify the structure of a set of materials; the problems
require  search  in  a  space  that  contains  many  possible  structural 
descriptions  or  rules,  most  of  which  are  inconsistent  with  some  features  of 
the  problem  information.  In  tasks  frequently  used  to  study  deductive 
reasoning, problem solvers judge whether arguments that are presented are
valid; the process involves an attempt to derive the conclusion from the
premises,  a  search  for  a  sequence  of  inferential  actions  just  as  in 
problems  of  transformation. 
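
The view of induction as search among candidate rules can be illustrated
with a small Python sketch. The four candidate rules and the function name
are hypothetical, and the hypothesis space here is deliberately tiny; the
point is only that each additional observation eliminates rules that are
inconsistent with it.

```python
# Hypothetical hypothesis space: each rule maps one term to the next.
CANDIDATE_RULES = {
    "add 2": lambda x: x + 2,
    "add 3": lambda x: x + 3,
    "double": lambda x: x * 2,
    "square": lambda x: x * x,
}


def consistent_rules(sequence, candidates):
    """Return the names of rules that map every term to its successor."""
    pairs = list(zip(sequence, sequence[1:]))
    return [name for name, rule in candidates.items()
            if all(rule(a) == b for a, b in pairs)]


print(consistent_rules([2, 4], CANDIDATE_RULES))     # ['add 2', 'double', 'square']
print(consistent_rules([2, 4, 8], CANDIDATE_RULES))  # ['double']
```

With two terms, three rules remain consistent with the data; the third term
rules out all but one.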

In  Section  VI  we  present  some  conclusions. 

II. Well Specified Problems


In  this  section  we  discuss  problem  solving  in  relatively  well 
structured  situations.  First,  we  consider  problems  in  which  a  definite 
goal is specified. The problem solver is given an initial situation or
problem  state,  a  set  of  operators  that  can  be  used  to  change  the  situation, 
and  a  goal  state.  The  task  is  to  find  a  sequence  of  actions,  restricted  to 
use  of  the  permitted  operators,  that  results  in  the  goal  state.  In  Section 
II.A we discuss goal-directed problems for which the problem solver has
little  or  no  specific  knowledge  or  experience,  so  that  the  problem  solving 
depends  on  using  general  problem-solving  knowledge  sometimes  called  "weak 
methods." In Section II.B, we discuss problems of the same structure for
which individuals have received special training or experience, thus

II.A. General Knowledge for Novel Problems with Specific Goals

A  substantial  body  of  research  has  been  conducted  on  solution  of 
well-structured  puzzle-like  problems  that  require  relatively  little 
domain-specific knowledge. In hindsight, the research strategy of focusing
on such problems has some advantages, even beyond the obvious ones of
making the experiments simpler and the data easier to interpret. In
difficult  problem  domains  requiring  special  knowledge,  we  are  likely  to 
learn  from  our  subjects  principally  what  they  know  and  how  they  have 
organized  and  represented  their  knowledge  in  memory,  because  much  of  an 
individual's  success  depends  on  whether  he  or  she  knows  the  specific 
principles  and  procedures  of  the  domain. 


In experiments in domains that are relatively free of specialized
content  and  where  subjects  are  relatively  naive,  we  may  still  find 
significant  differences  in  behavior  from  subject  to  subject  and  from  domain 
to  domain,  but  we  also  are  likely  to  discover  some  of  the  commonalities  of 
behavior  that  characterize  problem  solving,  at  least  by  novices,  over  a 
wide  range  of  domains.  We  also  are  likely  to  detect  the  flexible, 
general-purpose  techniques  that  people  fall  back  on  when  they  do  not  have 
special  knowledge  or  methods  adapted  specifically  to  the  task  at  hand. 

These fall-back techniques, often called "weak methods," are the only
weapons  that  are  available  for  attacking  truly  novel  problems.  Hence, 
understanding  them  should  contribute  also  to  an  understanding  of  discovery 
processes  and  creative  problem  solving. 

An  important  general  concept  in  the  analysis  of  problem  solving  is  the 
problem  space,  consisting  of  the  problem  solver's  representation  of  the 
materials of the problem along with knowledge that is relevant to the task.
The  problem  space  includes  a  representation  of  the  problem  goal  and 
operators  that  can  be  used;  these  may  be  specified  in  the  problem 
description  or  supplied  by  the  problem  solver's  knowledge.  The  operators 
include  actions  that  can  be  performed  and  conditions  that  are  required  for 
performance  of  the  actions.  The  problem  space  also  includes  the  problem 
solver's strategic knowledge, which may include methods previously acquired
through experience in the domain, as well as general problem-solving
methods.

In this section we discuss tasks in which definite goals are specified
in the problem instructions. Subjects solving these problems are usually
not experienced in the tasks. The problem-solving operators also are
specified in the problem instructions, rather than being known in advance
by the problem solvers, and the problem solvers must rely on general
problem-solving  strategies,  that  is,  on  weak  methods.  The  principal 
methods  of  this  kind  employ  a  general  problem-solving  heuristic,  called 
means-ends  analysis,  a  process  that  involves  comparing  the  current  state 
with  the  goal  of  the  problem  or  a  subgoal  that  the  problem  solver  is  trying 
to  achieve,  and  selecting  an  operator  that  can  reduce  differences  between 
the  current  state  and  the  goal. 
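
The following Python sketch shows one simple way in which means-ends
analysis with subgoaling might be expressed. It is a simplified illustration
under assumptions of our own: states and goals are sets of facts, each
operator lists the facts it requires, adds, and deletes, and the first
relevant operator is always tried first. The Operator class, the achieve
function, the depth limit, and the tea-making operators are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    name: str
    preconditions: frozenset          # facts that must hold before the action
    adds: frozenset                   # facts the action makes true
    deletes: frozenset = frozenset()  # facts the action makes false


def achieve(state, goal, operators, depth=8):
    """Return (final_state, plan) making every goal fact true, or None."""
    plan = []
    for fact in sorted(goal):      # each missing fact is a difference
        if fact in state:
            continue
        if depth == 0:
            return None            # give up on very deep subgoal chains
        for op in operators:
            if fact not in op.adds:
                continue           # operator is irrelevant to this difference
            # Subgoal: bring about the operator's preconditions first.
            result = achieve(state, op.preconditions, operators, depth - 1)
            if result is None:
                continue
            sub_state, sub_plan = result
            state = (sub_state - op.deletes) | op.adds
            plan.extend(sub_plan + [op.name])
            break
        else:
            return None            # no operator reduces this difference
    # Reject plans in which a later step undid an earlier goal fact.
    return (state, plan) if goal <= state else None


# Hypothetical operators for a toy domain.
OPERATORS = [
    Operator("fill kettle", frozenset(), frozenset({"kettle-filled"})),
    Operator("boil water", frozenset({"kettle-filled"}), frozenset({"water-hot"})),
    Operator("steep tea", frozenset({"water-hot", "have-teabag"}),
             frozenset({"have-tea"})),
]

final_state, plan = achieve(frozenset({"have-teabag"}),
                            frozenset({"have-tea"}), OPERATORS)
print(plan)  # ['fill kettle', 'boil water', 'steep tea']
```

The difference between the initial state and the goal selects "steep tea";
its unmet precondition becomes a subgoal, which in turn selects "boil
water", and so on until an operator can be applied directly.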

Research has been conducted on several tasks of this general kind.
Here we discuss two tasks: proof discovery exercises in logic (Newell &
Simon, 1972), and water-jar problems (Atwood & Polson, 1976). The studies
that we discuss illustrate use of two empirical methods. Newell and
Simon's study of logic proof discovery used detailed analyses of
thinking-aloud protocols obtained from a few subjects, with data from a
larger group of subjects to check the representativeness of some general
features of performance. Atwood and Polson's study of water jug problems
used  frequencies  of  responses  that  occurred  during  problem  solving  to 
evaluate  a  model  of  problem  solving  expressed  in  quantitative  form. 

II.A.1. Discovering Proofs in Logic. Discovering proofs for
mathematical theorems of one kind or another is a task all of us have faced
frequently in school and a few of us in our professional lives. One domain
in which theorem proving has been studied extensively is elementary
symbolic logic (Moore & Anderson, 1954; Newell, Shaw, & Simon, 1957;
Newell & Simon, 1972). The propositional calculus is defined by only two
rules of inference and a dozen axioms. In the studies that we discuss, the
task was presented as a syntactic game of transforming strings of
uninterpreted symbols according to rules given as symbolic formulas. This
ensured that subjects could not draw readily on such commonsense knowledge
as they may have had of the laws of reasoning. (The studies of syllogistic
reasoning that we discuss in Section V directly address the question of
subjects' knowledge of formal logical rules.)

Deduction and Induction in Problem Solving. At the outset we must
deal  with  one  common  misconception  about  proof-finding  tasks.  Logic  is  the 
science  of  deductive  reasoning  from  premises  to  conclusions.  A  proof  is  a 
sequence  of  expressions  starting  with  axioms  (or  previously  proved 
expressions)  and  terminating  with  the  desired  theorem;  each  step  of  the 
proof  must  satisfy  the  laws  of  deduction.  Its  validity  can  be  checked, 
step  by  step,  by  applying  those  laws  systematically. 

Finding the proof of a theorem is another matter. We have a known
starting  point  —  the  axioms  --  and  a  known  goal  —  the  theorem  —  but  in 
most mathematical domains there is no systematic rule for constructing a
path  from  axioms  to  theorem.  That  path  must  be  discovered,  and  the  usual 
method for discovering it is to search for it, the amount of trial and
error  required  depending  on  how  selectively  the  search  is  carried  out. 
Hence, while a proof is an example of a logical deduction, the
problem-solving activity involved in searching for a proof is inductive
search, as is most interesting problem solving whatever the task domain may
be.


The Moore-Anderson Logic Problems. In the logic task designed by
Moore  and  Anderson  (1954),  subjects  were  not  told  that  they  were 
discovering  proofs  in  symbolic  logic,  but  were  simply  instructed  to 
"recode"  certain  strings  of  symbols  into  other,  specified,  strings,  using  a 
given  set  of  transformation  rules.  The  rules  were  displayed  on  a  sheet  of 
paper, which was available to the subjects at all times. A typical rule
(there  were  twelve,  some  with  subparts)  was: 

A V B → B V A,

which was to be interpreted: The expression A V B may be transformed into
the  expression  B  V  A,  where  A  and  B  are  variables  for  which  any  parts  of 
an  expression  can  be  substituted.  The  connectives  in  such  expressions  were 
referred  to  by  the  experimenter  as  "wedge,"  "dot,"  "horseshoe,"  and 
"tilde,"  instead  of  being  given  their  usual  interpretations  in  logic  as 
"or,"  "and,"  "implies,"  and  "not."  Subjects  were  run  on  this  task  by 
Carpenter,  Moore,  Snyder,  and  Lysansky  (1961)  at  Yale,  and  by  Newell  and 
Simon  (1972)  at  Carnegie  Institute  of  Technology. 
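
A rule of this kind can be read as a pattern-rewriting operation. The following is a minimal sketch in Python, not the procedure used in the experiments. The nested-tuple encoding of expressions, the ASCII connective symbols, and the function names are illustrative assumptions, as is the assumption that the rule may be applied to any subexpression.

```python
# Expressions are nested tuples: (connective, left, right), ("~", x), or a letter.
# The connective names follow the experimenter's neutral vocabulary.
WEDGE, DOT, HORSESHOE, TILDE = "v", ".", ">", "~"


def commute_wedge(expr):
    """Apply A v B -> B v A at the top level; return None if the pattern fails."""
    if isinstance(expr, tuple) and len(expr) == 3 and expr[0] == WEDGE:
        _, a, b = expr
        return (WEDGE, b, a)
    return None


def apply_anywhere(rule, expr):
    """Return every expression obtained by applying `rule` to one subexpression."""
    results = []
    top = rule(expr)
    if top is not None:
        results.append(top)
    if isinstance(expr, tuple):
        for i in range(1, len(expr)):
            for new_sub in apply_anywhere(rule, expr[i]):
                results.append(expr[:i] + (new_sub,) + expr[i + 1:])
    return results


# (P v Q) . ~R: the rule matches only the left subexpression.
expr = (DOT, (WEDGE, "P", "Q"), (TILDE, "R"))
rewrites = apply_anywhere(commute_wedge, expr)
```

Each of the twelve rules could be written as a function of the same shape, so that the set of legal moves from an expression is the union of the rewrites produced by all rules.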

Several  kinds  of  data  can  be  obtained  in  problem-solving  tasks  of  this 
kind.  The  times  to  solution  can  be  recorded,  as  well  as  the  times  for 
making  each  successive  transformation  of  an  expression.  Numbers  of  correct 
solutions  can  be  counted,  and  errors  can  be  classified  and  analysed. 

Thinking-aloud Protocols. The richest data, however, are obtained by
instructing  subjects  to  think  aloud  while  solving  the  problem.  The  verbal 
protocols  provide  a  higher  temporal  density  of  data  than  is  usually 
obtained  by  other  methods  (except,  perhaps,  from  records  of  eye  movements). 
Typically,  subjects  speak  at  an  average  rate  of  about  two  words  per  second, 
although  there  are  of  course  substantial  differences  among  subjects  and 
from  one  part  of  a  task  to  another. 

If thinking-aloud data are to be used correctly and effectively to
help understand subjects' cognitive processes, answers are needed to
several questions, especially: (1) which processes, or what parts of the
processes, are verbalized, and (2) to what extent does verbalization alter
or in any way affect the problem-solving process itself. A recent
extensive review of relevant literature (Ericsson & Simon, 1980) supports
three general conclusions. First, subjects mainly verbalize a subset of
the symbols that pass through STM as the task is being performed. The
verbalizations will be more complete (i.e., will give a fuller record of
successive STM contents) if the problem is being solved in terms of verbal
symbols than if the STM contents have to be translated from some other
modality (i.e., visual images). Second, the process of recognizing some
familiar visual or auditory stimulus does not produce any intermediate
symbols in STM that can be reported; only the result of the recognition
process can be reported. Third, in most problem-solving tasks, the
cognitive processes are the same in the thinking-aloud as in the silent
condition. Moreover, in general, the speed of task performance is neither
increased nor decreased by the instructions to think aloud.

The protocols under discussion here are those produced by subjects
concurrently with their performing the cognitive task. In using
retrospective protocols as data, additional factors must be taken into
consideration. First, only such information can be reported
retrospectively as has been transferred to LTM and retained there. Second,
unless the instructions call for recall of specific events, subjects may
engage, in a variety of ways, in active reconstruction of the event or
process that is being probed. Hence, retrospective protocols must be
interpreted in the light of what we know about the laws of memory and
forgetting (Bartlett, 1932; Nisbett & Wilson, 1977).

The  most  detailed  analysis  of  problem-solving  protocols  calls  for 
reconstructing  from  them  the  successive  cognitive  states  of  subjects  as 
they  work  toward  the  problem  solution.  "Cognitive  state"  means  what  the 
subject  knows  or  has  found  out  about  the  problem  up  to  the  time  of  the 
protocol  fragment  being  examined,  along  with  information,  such  as  subgoals 
and  evaluations,  that  has  been  generated  by  the  subject  from  decisions  and 
judgments.  Typically,  in  tasks  like  the  logic  theorem  proving  task, 
subjects  verbalize  the  symbolic  expressions  they  produce  and  those  they  are 
actively  considering,  the  operators  they  are  applying  to  transform 
expressions,  and  often  the  goals  they  are  trying  to  attain  (e.g.,  the  final 
theorem  or  expressions  they  think  would  bring  them  closer  to  it)  (Newell  & 
Simon,  1972).  As  they  proceed,  subjects  often  evaluate  their  progress  and 
the  suitability  of  steps  they  have  just  taken. 

From  such  protocol  statements  we  can  usually  reconstruct  the  problem 
space  in  which  a  subject  is  operating.  Recall  that  a  problem  space 
includes  a  subject's  representation  of  the  problem  situation,  the  goal, 
problem-solving  operators,  constraints,  and  strategic  knowledge.  More 
formally,  a  problem  space  is  defined  by  a  set  of  symbol  structures, 
corresponding  to  the  cognitive  states  that  can  be  generated  as  the  subject 
works  on  the  task,  and  a  set  of  cognitive  operators,  information  processes 
that  produce  new  cognitive  states  from  existing  ones.  The  problem-solving 
efforts  of  a  subject  may  be  described  as  searches  through  a  problem  space, 
from  one  cognitive  state  to  another,  until  the  solution  (a  particular 
cognitive  state)  is  found  or  the  search  is  abandoned. 
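
The formal definition above (a set of states plus operators that produce new states) can be sketched as a search procedure. This is an illustration only: breadth-first search is used as a neutral stand-in and is not a claim about how subjects search, and the three-letter strings and the `swap` and `rotate` operators are hypothetical.

```python
from collections import deque


def search(initial, goal, operators, max_states=10_000):
    """Breadth-first search of a problem space.

    `operators` maps an operator name to a function that returns the list of
    states reachable from a given state. Returns the list of
    (operator name, resulting state) steps, or None if no path is found.
    """
    frontier = deque([initial])
    parent = {initial: None}          # state -> (previous state, operator name)
    while frontier and len(parent) <= max_states:
        state = frontier.popleft()
        if state == goal:
            path = []
            while parent[state] is not None:
                previous, name = parent[state]
                path.append((name, state))
                state = previous
            return path[::-1]
        for name, operator in operators.items():
            for successor in operator(state):
                if successor not in parent:
                    parent[successor] = (state, name)
                    frontier.append(successor)
    return None


# Hypothetical toy space: states are three-letter strings.
toy_operators = {
    "swap": lambda s: [s[1] + s[0] + s[2:]],   # exchange the first two symbols
    "rotate": lambda s: [s[1:] + s[0]],        # move the first symbol to the end
}
path = search("ABC", "CBA", toy_operators)
```

The `parent` table records how each state was reached, so the sequence of states from the initial state to the solution can be read back once the goal state is found.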

Given  a  description  of  the  problem  space,  inferred  from  a  protocol,  a 
search  tree,  called  a  Problem  Behavior  Graph  (PBG),  can  be  constructed  to 
represent  the  course  of  the  subject's  search.  The  size  and  shape  of  the 
PBG  will  disclose  the  extent  of  the  subject's  skill  and  knowledge  and  the 
consequent  selectivity  he  is  able  to  achieve.  Given  the  PBG,  in  turn,  the 
experimenter  can  undertake  to  construct  a  simulation  program  for  a  computer 
which,  if  given  the  same  problem,  would  generate  the  same  PBG  as  that 
generated  by  the  subject. 
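
A PBG can be held in a simple tree structure from which its size and shape are computed. The sketch below is an assumed encoding, not Newell and Simon's notation: the class and field names are hypothetical, and the small example graph is invented for illustration (it is only loosely patterned on a subject who applies one rule, fails with another, backs up, and tries again).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class State:
    """One cognitive state (node) in a problem behavior graph."""
    knowledge: str                 # expression written or difference noticed
    operator: str = ""             # operator on the line leading to this state
    applied: bool = True           # False if the operator was tried but failed
    successors: List["State"] = field(default_factory=list)

    def extend(self, knowledge, operator, applied=True):
        nxt = State(knowledge, operator, applied)
        self.successors.append(nxt)
        return nxt


def size(node):
    """Number of states in the graph."""
    return 1 + sum(size(s) for s in node.successors)


def depth(node):
    """Number of states on the longest path from this node."""
    return 1 + max((depth(s) for s in node.successors), default=0)


def backups(node):
    """Number of returns to an earlier state (extra branches leaving a node)."""
    extra = max(len(node.successors) - 1, 0)
    return extra + sum(backups(s) for s in node.successors)


# Hypothetical fragment of a graph.
start = State("L1")
l2 = start.extend("L2", "R6")
l2.extend("sign difference", "R7", applied=False)
l3 = start.extend("L3", "R6 on left only")
l3.extend("connective difference", "R7", applied=False)
summary = (size(start), depth(start), backups(start))
```

A highly selective search gives a narrow graph with few backups; a less informed search gives a bushier one.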

The  accuracy  of  fit  of  the  simulation  program  to  the  strategy  that 
guides  a  subject's  behavior  can  be  judged  by  comparing  the  program's  trace 
step  by  step  with  the  problem-solving  protocol.  Formal  methods  for  judging 
goodness  of  fit  in  a  statistical  sense  are  not  available,  but  departures  of 
trace  from  protocol  are  easy  to  detect.  These  discrepancies  then  form  the 
basis  for  modifying  the  simulation  program  to  fit  the  protocol  better. 
Except for the fact that the data we are dealing with here are not
numerical, the process of fitting a computer program to protocol data is
identical in principle with the process of fitting a system of differential
equations to time series data.

A  basic  problem  space  for  the  logic  task  is  one  in  which  the  subject's 
cognitive  state  is  defined  by  the  logic  expressions  thus  far  derived  from 
the  initial  given  expression,  and  by  the  legal  operators  for  generating  new 
expressions  from  these.  Since  the  protocol  normally  discloses  both  what 
operators  are  being  applied  and  what  expressions  are  obtained  from  the 
application,  there  will  be  a  great  deal  of  redundancy  in  the  available 
information  to  test  the  consistency  of  the  interpretation.  Many  protocols 
will  allow  a  richer  problem  space  to  be  inferred  —  one  in  which  the 
subject  notes  similarities  and  differences  among  logic  expressions,  and 
chooses  his  or  her  next  step  in  terms  of  them.  When  the  subject's  choice 
of  actions  is  also  guided  by  goals  and  subgoals,  these  are  also  added  to 
the  description  of  the  problem  space. 

Solution  Processes.  No  single  strategy,  or  simulation  program  based 
on  such  a  strategy,  can  be  expected  to  describe  the  problem-solving 
behavior  of  all  subjects.  However,  the  behavior  of  a  great  many  subjects 
in  task  domains  like  logic  theorem  proving  reveals  a  small  number  of  common 
mechanisms  as  central  features  of  the  problem-solving  process.  One  of  the 
most important of these is means-ends analysis, first introduced into the
problem-solving  literature  by  Duncker  (1935/1945).  Means-ends  analysis 
requires  a  problem  space  rich  enough  to  contain  not  only  logic  expressions 
and  operators,  but  also  symbol  structures  that  describe  differences  between 
pairs  of  logic  expressions  and  other  symbol  structures  that  describe  goals. 
Thus,  a  subject  operating  in  such  a  problem  space  might  say,  "I  have  an 
expression  whose  main  connective  is  a  horseshoe,  and  my  goal  expression  has 
a  wedge.  Let  me  look  for  an  operator  that  will  change  horseshoe  to  wedge." 

In broadest outline, means-ends analysis can be described by the
following set of productions, where S is the present state or expression, G
is the goal expression, D is a difference between two expressions, and O is
an operator:

    If the goal is to remove difference D between S and G
        → find a relevant operator O
          and set the goal of applying it.

    If the goal is to apply O to S,
       and condition C for applying O is unsatisfied
        → set the goal of satisfying C by modifying S.

    If the goal is to apply O to S → make application.

    If there is a difference D between S and G
        → set the goal of removing it.

    If there is no difference between S and G
        → halt and report problem solved.

While  the  production  system  displayed  here  does  not  describe  all  the 
details  of  the  control  of  search,  it  provides  the  main  outlines  of 
means-ends analysis. The system seeks to detect a difference between the
present  position  in  the  problem  space  and  the  goal  position.  Given  such  a 
difference,  it  searches  memory  for  an  operator  that  is  relevant  for 
removing  the  difference.  Having  found  an  operator,  it  attempts  to  apply 
it.  If  all  the  conditions  for  operator  application  are  not  satisfied,  it 
expresses  the  discrepancy  as  a  new  difference  and  establishes  the  goal  of 
reducing  it.  The  scheme  operates  recursively,  and  as  soon  as  one 
difference has been removed, it looks for another. An important component
of the strategy not represented in the productions is the use of memory to
store  goals  that  have  been  tried,  so  the  problem  solver  can  avoid  looping 
through  the  same  cycle  of  repeated  unsuccessful  attempts  of  a  goal  that 
cannot  be  achieved. 
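
The control structure just described can be sketched in a few lines. This is a simplified illustration, not the GPS program: states are sets of facts, a difference is a goal fact missing from the state, and the door-and-key domain, the `Operator` type, and all function names are hypothetical. The `tried` argument plays the role of the memory for goals already attempted.

```python
from typing import FrozenSet, NamedTuple


class Operator(NamedTuple):
    name: str
    conditions: FrozenSet[str]           # must hold before application
    adds: FrozenSet[str]                 # differences the operator removes
    deletes: FrozenSet[str] = frozenset()


def remove_difference(state, fact, operators, tried=()):
    """Goal: remove the difference `fact` between state and goal."""
    if fact in state:
        return state, []
    if fact in tried:                    # avoid looping on the same goal
        return None
    for op in operators:
        if fact in op.adds:              # operator relevant to the difference
            result = apply_operator(state, op, operators, tried + (fact,))
            if result is not None:
                return result
    return None


def apply_operator(state, op, operators, tried):
    """Goal: apply `op`, first satisfying any unsatisfied conditions."""
    plan = []
    for condition in op.conditions:
        result = remove_difference(state, condition, operators, tried)
        if result is None:
            return None
        state, steps = result
        plan += steps
    if not op.conditions <= state:       # an earlier condition was undone
        return None
    return (state - op.deletes) | op.adds, plan + [op.name]


def solve(state, goal, operators):
    """Remove differences one at a time until none remain."""
    plan = []
    for fact in goal:
        result = remove_difference(state, fact, operators)
        if result is None:
            return None
        state, steps = result
        plan += steps
    return plan if goal <= state else None


toy_ops = [
    Operator("walk in", frozenset({"door open"}), frozenset({"inside"}),
             frozenset({"outside"})),
    Operator("open door", frozenset({"have key"}), frozenset({"door open"})),
    Operator("pick up key", frozenset(), frozenset({"have key"})),
]
plan = solve(frozenset({"outside"}), frozenset({"inside"}), toy_ops)
```

The recursion between `remove_difference` and `apply_operator` mirrors the first two productions: a difference selects an operator, and an unsatisfied condition becomes a new difference to be removed.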

A  clear  distinction  can  be  made  between  the  general  strategy  of 
means-ends  analysis  and  domain-specific  knowledge  that  is  required  for  the 
strategy  to  be  used  in  solving  any  particular  problem.  The  general 
strategy  is  represented  in  the  productions  shown  above.  To  use  these 
productions,  a  problem  solver  must  be  able  to  represent  the  state  S  and  the 
goal  G,  and  identify  differences  between  them.  In  the  domain  of  logic, 
states  correspond  to  expressions,  and  differences  involve  different 
letters,  different  connectives,  and  different  arrangements  of  letters  and 
connectives.  The  problem  solver  also  must  know  what  operators  can  be  used, 
what  conditions  permit  each  operator  to  be  applied,  and  what  kinds  of 
difference  are  removed  by  use  of  each  operator.  In  logic,  the  operators 
are the rules for transforming expressions. The conditions are patterns
that are specified in the rules, and the relevant differences for a rule
can be inferred by comparing the two sides of the rule. For example,
A · B → A requires a pattern in which two subexpressions are connected by
a dot, and has the effect of removing a letter or a subexpression, as well
as removing the dot. A ⊃ B ↔ ~A ∨ B does not remove or add any letters.
It can be applied to a pattern with a horseshoe to change the horseshoe to
a wedge or vice versa, and it changes the sign of one of the letters or
subexpressions.
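
The comparison of the two sides of a rule can be made mechanical. The sketch below is an assumption-laden illustration: rules are written as ASCII strings (`.` for dot, `v` for wedge, `>` for horseshoe, `~` for tilde), the function name is hypothetical, and a change in the number of tildes is used as a crude proxy for a change of sign.

```python
from collections import Counter

CONNECTIVES = {".": "dot", "v": "wedge", ">": "horseshoe"}


def rule_differences(lhs, rhs):
    """Summarize what changes between the left and right sides of a rule."""
    def letters(side):
        return Counter(c for c in side if c.isupper())

    def connectives(side):
        return Counter(CONNECTIVES[c] for c in side if c in CONNECTIVES)

    return {
        "letters_removed": dict(letters(lhs) - letters(rhs)),
        "letters_added": dict(letters(rhs) - letters(lhs)),
        "connectives_removed": dict(connectives(lhs) - connectives(rhs)),
        "connectives_added": dict(connectives(rhs) - connectives(lhs)),
        "sign_changed": lhs.count("~") != rhs.count("~"),
    }


drop_term = rule_differences("A . B", "A")          # A . B -> A
horseshoe_to_wedge = rule_differences("A > B", "~A v B")
```

A table built this way, with one entry per rule, is the kind of connection between operators and differences that means-ends analysis needs in order to find a relevant operator.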

The  general  strategy  of  means-ends  analysis  has  been  implemented  in  a 
program  called  the  General  Problem  Solver  (GPS)  and  shown  to  be  sufficient 
for  providing  solutions  in  over  a  dozen  problem  domains,  including  puzzles 
such  as  the  Tower  of  Hanoi  and  tasks  such  as  integral  calculus,  given 
appropriate  representations  of  the  states,  operators,  and  connections 
between  operators  and  differences  in  the  specific  domains  (Ernst  &  Newell, 
1969). 


In  the  experiments  conducted  with  the  logic  task,  subjects  were  not 
experienced  in  the  domain.  The  operators  were  presented  as  part  of  the 
task instructions, and it is reasonable to expect that subjects had to rely
mainly  on  general  problem-solving  strategies,  rather  than  having 
domain-specific  knowledge  available  for  the  task.  If  this  is  correct,  and 
if  the  subjects'  general  problem-solving  strategies  have  the  properties  of 
GPS,  then  their  performance  in  the  logic  task  should  be  similar  to  that  of 
GPS  when  it  is  run  on  the  task.  The  results  were  quite  positive. 

Kinds  of  Evidence.  The  hypothesis  was  evaluated  at  three  levels. 
First,  specific  protocols  were  examined,  comparing  the  statements  made  by 
subjects  with  the  steps  in  solutions  by  specific  versions  of  GPS.  For 
these  simulations,  GPS  was  varied  by  supplying  it  with  differing  priorities 
of  differences.  Second,  a  set  of  protocols  (all  those  obtained  by  Newell 
and  Simon  on  one  moderately  difficult  problem)  were  coded  and  each  protocol 
was  translated  into  a  Problem-Behavior  Graph,  showing  a  succession  of 
cognitive  states  that  was  inferred  from  the  statements  and  problem-solving 
operators  to  account  for  the  transitions  between  states.  The 
state-to-state  transitions  were  classified,  and  the  categories  were 
compared  with  categories  of  activity  that  are  performed  by  GPS.  Third, 
some  summary  statistics  were  compiled  for  Newell  and  Simon's  subjects  and 
for the subjects run at Yale, involving the frequencies of occurrence of
several  intermediate  steps  in  solutions  of  the  problems.  These  statistics 
were  compared  to  detect  any  gross  abnormalities  in  Newell  and  Simon's  data, 
compared  to  a  larger  group  of  subjects  who  solved  the  problem  with  pencil 
and  paper  without  the  requirement  of  thinking  aloud. 

As  Table  1  illustrates,  individual  protocols  can  often  be  simulated  in 
great detail, but of course there will be differences among individuals in
their  problem  solving  methods,  hence  in  the  production  systems  that  would 
describe  them.  For  purposes  of  psychological  theory,  we  are  often  less 
interested  in  the  details  of  a  particular  simulation  (except  as  a  very 
strong  test  of  the  theory)  than  we  are  in  the  structure  of  a  program  that 
simulates  the  main  mechanisms  revealed  in  a  whole  set  of  protocols.  The 
problem  of  averaging  over  groups  of  subjects  can  also  be  handled  formally 
by  comparing  the  statistics  of  behavior  of  a  program  with  the  statistics  of 
the  human  subjects  as  a  group.  In  this  section,  we  examine  the  processes 
for  comparing  programs  in  detail  with  individual  protocols,  and  in  Section 
II.A.2 we discuss the statistical approach.


Table  1  here 


Individual  Protocols.  Newell  and  Simon  presented  several  protocols  in 
which  activities  of  subjects  reflect  processes  like  those  in  GPS.  An 
illustration  is  in  Table  1.  A  segment  of  one  subject's  protocol  is  shown, 
along  with  a  trace  of  a  version  of  GPS  working  on  the  same  problem.  In  the 
protocol and the GPS trace, L0 refers to the goal expression and L1 refers
to the initial expression of the problem. L2, L3, and so on refer to
additional expressions that are generated by the problem solver by applying
operators to L1 and other previously generated expressions.

The  operators  that  are  referred  to  in  this  segment  are 

    R6:  A ⊃ B ↔ ~A ∨ B

    R7:  A ∨ (B · C) ↔ (A ∨ B) · (A ∨ C)
         A · (B ∨ C) ↔ (A · B) ∨ (A · C)

The  protocol  segment  in  Table  1  began  near  the  end  of  the  first  minute  of 
work  on  the  problem,  and  occupied  a  little  more  than  three  minutes. 

In this segment, both the subject and GPS had the goal of deleting the
letter R from the initial expression. Both of the problem solvers
considered rule R7 as a possible means of accomplishing this. R7 cannot be
applied to L1 because its connectives are wrong, so a subgoal was set to
change the connective of L1. This led to use of R6, but the two
occurrences of R in the transformed expression have opposite signs.
Attempts were made to change one of the signs, but this returns the
horseshoe to the subexpression. At this point the subject, and the
specific version of GPS that produced this run, were both unable to
continue on this line of work.

This protocol and GPS trace are similar to an impressive degree of
detail. However, the important finding is not the fact that the subject
and GPS tried to use the same rules in the same sequence. The precise
sequence of rules used by GPS can be tailored fairly arbitrarily, and
indeed other versions of GPS would not try to use R6 and R7 in this
situation. The important finding involves the general character of the
subject's performance, involving goals related to differences between the
current expression and the problem goal and subgoals to make operators
applicable. The protocol provides several clear illustrations of
activities that are consistent with the hypothesis of a GPS-like
problem-solving process.


Table 1

Comparison of GPS with Protocol Data
(from Newell & Simon, 1972)


GPS Trace

L0: ~(~Q · P)
L1: (R ⊃ ~P) · (~R ⊃ Q)

Goal 1: Transform L1 into L0
Goal 2: Delete R from L1

Goal 2: (reinstated)
Goal 9: Apply R7 to L1
Goal 10: Change connective to ∨ in left(L1)
Goal 11: Apply R6 to left(L1)
    Produce L4: (~R ∨ ~P) · (~R ⊃ Q)

Goal 12: Apply R7 to L4
Goal 13: Change connective to ∨ in right(L4)
Goal 14: Apply R6 to right(L4)
    Produce L5: (~R ∨ ~P) · (R ∨ Q)

Goal 15: Apply R7 to L5
Goal 16: Change sign of left(right(L5))
Goal 17: Apply R6 to right(L5)
    Produce L6: (~R ∨ ~P) · (~R ⊃ Q)

Goal 18: Apply R7 to L6
Goal 19: Change connective to ∨ in right(L6)
    reject
Goal 16: (reinstated)
    nothing more
Goal 13: (reinstated)
    nothing more
Goal 10: (reinstated)
    nothing more


Subject Protocol

Now I'm looking for a way to get rid of the horseshoe inside the two
brackets that appear on the left and right sides of the equation.

And I don't see it.

Yeh, if you apply R6 to both sides of the equation.

From there I'm going to see if I can apply R7.

(E writes L2: (~R ∨ ~P) · (R ∨ Q))

I can almost apply R7, but one R needs a tilde. So I'll have to look for
another rule.

I'm going to see if I can change that R to a tilde R. As a matter of fact,
I should have used R6 on only the left hand side of the equation. So use
R6, but only on the left hand side.

(E writes L3: (~R ∨ ~P) · (~R ⊃ Q))

Now I'll apply R7 as it is expressed.

Both... excuse me, excuse me, it can't be done because of the horseshoe.
So... now I'm looking... scanning the rules here for a second, and seeing
if I can change the R to a ~R in the second equation, but I don't see any
way of doing it.

(Sigh) I'm just sort of lost for a second.

Problem  Behavior  Graphs.  It  is  important  to  consider  whether 
activities  like  those  in  Table  1  are  typical  of  problem  solvers,  or  are 
relatively  rare.  Newell  and  Simon  addressed  this  question  by  examining 
Problem  Behavior  Graphs  (PBGs)  obtained  from  the  protocols  of  several 
subjects  working  on  a  moderately  difficult  problem. 

Figure  1  here 


An example of a PBG is shown in Figure 1. The numbers prefixed by B
on the left correspond to lines of the transcribed protocol. This PBG was
obtained from the protocol that includes the segment given in Table 1,
which corresponds to the section of the PBG starting at B10 and ending just
before B29. Information included in the cognitive states is in the
rectangles; operators are shown on the lines that connect the rectangles.
Information in the rectangles refers to new expressions that were written
(e.g., L2 or L3, indicated in the protocol), or differences between a
current expression and the goal that the subject was considering. For
example, "Δg" refers to a difference in grouping of terms and "Δc l&r"
refers to the difference between connectives in the given expression and
the goal of applying R7 (horseshoes in both the left and right sides of L1
and wedges or dots needed to apply R7).

Most of the operators refer to the rules; we mentioned R6 and R7
earlier. When a rule is applied successfully, there is an arrowhead on the
line between rectangles. When a rule is shown with a line without an
arrowhead, there was a goal to apply the rule but it did not succeed.
Double lines indicate repetitions of attempts to apply rules.

The relation between the protocol and the PBG can be illustrated by
examining the first few lines of Table 1 and the PBG starting at B10. "get
L0" refers to consideration of the goal; this led to recognition of the
difference in grouping between L0 and L1 ("Δg"). Then the subject
attempted to apply R7; this led to identifying the differences in
connectives noted in the third rectangle ("Δc l&r"). Then an attempt to
apply R6 was successful, resulting in line L2. The subject attempted to
apply R7 a second time and noticed that there was a difference in the signs
of the R terms in the two subexpressions ("Δs R"). From time to time, the
subject "backed up" to an earlier state, as when he decided that R6 should
be applied only to the left side of L1. This is indicated by a vertical
line from the cognitive state that the subject returned to. R6 was applied
to the left subexpression of L1, giving line L3; then R7 was attempted
again, but the subject noticed the horseshoe, an incorrect connective for
R7. The subject returned to the goal of changing the sign of R in
expression L2, but the search for an appropriate rule (indicated by R in a
box) failed to produce anything helpful.

Table 2 here


PBGs were compiled for seven subjects, working on the problem in Table
1. The transitions between states were classified, and the categories were
compared with activities that occur when GPS works on a problem. The
categories, and their frequencies in the seven PBGs, are shown in Table 2.

Most of the categories shown in the table correspond to GPS-like
activities. Those that do not are marked with asterisks, accounting for
about 18% of the transitions in the PBGs. The most interesting
discrepancies involved choice of operators to avoid undesirable
consequences ("avoid consequences"), and noticing features of the problem
not related to the present goal ("noticing"). Simulation of these would
require significant additions to GPS's problem-solving processes. The
remaining discrepancies involve activities that relate to the requirement
of giving protocols ("command experimenter" and "review") or cases in which
there was insufficient information in the protocol to determine whether the
transition was related to one of the GPS-like categories ("other," except
for those in the subcategory "noticing").

Aggregate Frequencies. The data in Table 2 were obtained from a small
group of subjects who were required to think aloud as they worked. It is
possible that the subjects were atypical, or that the requirement of
thinking aloud caused major distortions in the way in which problem solving
occurred.

Newell and Simon compared some summary statistics from their subjects
with data obtained by Carpenter et al. (1961) at Yale University. The
number of subjects run at Yale was larger (64), and they solved the
problems with pencil and paper, without thinking aloud. If the data for
the Carnegie subjects did not differ from the Yale data in significant
ways, then there is evidence that the general characteristics of their
problem solving were not caused by individual idiosyncrasies, or by the
requirement of giving protocols while working on the problems.

The summary statistics involved a division of expressions into
categories. Each category consists of an expression from the problem, such
as the left subexpression of expression L1, and other expressions that can
be formed from it by making minor transformations. Minor transformations
for this purpose are those involving rules that change the order of terms,
the connectives, or the signs, but do not change the terms in an
expression. The data for each group of subjects are the proportions of all
the expressions written that fall into the categories. The categories of
expressions are listed in the left column. For example, expressions in
Class L1 are those that can be formed by applying one of the minor
transformations to expression L1 shown in Table 1. The categories that
were used are not arbitrary; they are motivated by the observation that
differences that require changing the terms in expressions are more
difficult to remove, and thus require higher priority in solving the
problems. (Also see the discussion of planning, which follows.)


Table 2

Total Frequencies of Occurrences of GPS-Like Mechanisms
in Seven Protocols (From Newell & Simon, 1972)


Category                                      Frequency

Means-ends analysis
    towards goal object
    operator applicability
        overcome difficulty
        further specify
        resolve uncertainty
        *avoid consequences
    avoid difficulty
    prepare desired result

Working forward
    systematic scan and evaluate
    input form similarity
    do something different

Working backward
    output form similarity

Repeated application
    after subgoal
        to overcome difficulty
        to further specify
        to resolve uncertainty
        to avoid consequences
        to correct error
        to process interruption
    implementation
        for plan
        *to command experimenter

Other
    *noticing
    *repeated application
    *new application

Total


Table 3 here


Data for the problem in Table 1 are shown in Table 3. The agreement
between the two groups of subjects was not exact, but the comparison does
not indicate major differences in problem-solving processes. A statistical
test shows that the difference between the category frequencies in the two
groups was not significant (χ²(4) = 8.86; p > .05).*1


Table  4  here 


Data are shown in Table 4 for a somewhat harder problem, in which the
given expression was L1 = (P ∨ Q) · (Q ⊃ R), and the goal was
L0 = P ∨ (Q · R). Again, the agreement is not exact, but the difference is
not large enough to reject the hypothesis that the two sets of responses
were produced by a single underlying process (χ²(8) = 15.27, p > .05).
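
The two significance statements can be checked with a short calculation. The sketch below uses the closed-form upper-tail probability of the chi-square distribution for an even number of degrees of freedom; the function name is hypothetical, and the use of eight degrees of freedom for the second test is an assumption based on the nine classes of expressions listed in Table 4.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """P(X >= x) for a chi-square variable with an even number of df.

    For df = 2m the tail is exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


p_first = chi2_upper_tail_even_df(8.86, 4)     # approximately 0.065
p_second = chi2_upper_tail_even_df(15.27, 8)   # approximately 0.054
```

Both probabilities exceed .05, although neither by a wide margin, which is consistent with the description of the agreement as "not exact."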

Planning Strategy. A second strategy of broad applicability and wide
use that was identified in the logic protocols is planning. The idea
underlying the planning strategy is that some gaps between the initial
situation and the goal are more important and potentially harder to remove
than  others.  If  the  problem  space  is  simplified  by  abstracting  the  problem 
expressions,  removing  from  them  the  less  important  features,  the  simplified 
expressions  will  define  a  much  smaller  space  through  which  the  search  can 
be  conducted  more  expeditiously.  If  a  solution  can  be  found  to  the 
simplified  problem,  then  the  omitted  details  can  be  restored  and  this 
solution  used  as  a  guide  for  searching  in  the  original  problem  space. 

To use the planning strategy subjects must not only be able to apply
means-ends analysis, but must have enough knowledge of the problem space to
be able to distinguish "important" from "unimportant" differences between
expressions. For example, in the domain of logic, subjects gradually learn
that it is easier to change the connectives in logic expressions than to
change the letters. The planning space is then a space in which
expressions like (R ⊃ ~P) · (~R ⊃ Q) are replaced by (RP)(RQ). The
sequence of proof steps in the original space, R ⊃ ~P, ~R ⊃ Q,
~Q ⊃ R, ~Q ⊃ ~P, Q ∨ ~P, ~(~Q · P), becomes the simpler sequence in
the planning space, RP, RQ, PQ. The second step of the search in the
planning space corresponds to two separate steps in the original space, and
the third step in the planning space corresponds to three steps in the
original space: a reduction of one-half in the length of the derivation,
and of a much larger factor in the amount of search required to find it.
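
The abstraction that defines the planning space can be sketched as a function that discards everything but the letters. This is an illustration under stated assumptions: expressions are written as ASCII strings (`>` for horseshoe, `v` for wedge, `.` for dot, `~` for tilde), the function names are hypothetical, and the letters of an abstracted expression are sorted alphabetically, so that RP is written PR and RQ is written QR.

```python
def abstract(expression):
    """Keep only the letters of an expression, ignoring order, signs,
    connectives, and grouping."""
    return "".join(sorted(c for c in expression if c.isupper()))


def plan_steps(proof):
    """Collapse a proof into its sequence of distinct abstract steps."""
    plan = []
    for line in proof:
        step = abstract(line)
        if not plan or plan[-1] != step:
            plan.append(step)
    return plan


# Six-step derivation in the original space, in ASCII notation.
proof = ["R > ~P", "~R > Q", "~Q > R", "~Q > ~P", "Q v ~P", "~(~Q . P)"]
plan = plan_steps(proof)
```

Steps of the original proof that differ only in connectives, signs, or order of terms fall on the same abstract expression, which is why six steps in the original space collapse to three in the planning space.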

*1. The independence assumption of the chi-square test was not met in
these data, since several expressions were written by each subject.
However, this would generally make it more likely that a significant
difference would be obtained, so the conclusion seems warranted.


Table 4

Proportions of Expressions

Class of                 Carnegie              Yale
Expressions              (97 expressions)      (487 expressions)

L1                       .3 J                  .2a
extended L1              .02                   .04
left of L1               .14                   .19
right of L1              .14                   .13
(R ∨ P)                  .13                   .07
(P ∨ Q) · (P ∨ R)        .03                   .01
L0                       .03                   .01
Rule 9                   .16                   .18
other                    .01                   .07


Evidence for planning was obtained in protocols like the following,
obtained in a problem with four given expressions: L1 = P ∨ Q;
L2 = ~R ⊃ ~Q; L3 = S; L4 = R ⊃ ~S; and the goal: L0 = P ∨ T. Rule R9,
mentioned in the protocol, is A → A ∨ X, a rule for adding a term to an
expression.

Well, one possibility right off the bat is when you have just
a P∨T like that the last thing you might use is that R9. I can
get everything down to a P and just add a ∨T. So that's the one
thing to keep in mind.

Well, maybe right off the bat, I'm kinda jumping into it, I maybe
can work everything down to just a P; I dunno if that's possible.
But I think it is, because I see that steps 2 and 4 are somewhat
similar; if I can cancel out the R's, that would leave me with
just an S and Q;

and if I have just an S and Q, I can eventually get step 3, get the
S's to cancel out and end up with just a Q;

and if I end up with just a Q, maybe the Q's will cancel out; so
you see, all the way down the line. I dunno, it looks too good to
be true, but I think I see it already.

II.A.2. Water-Jar Problems. We now discuss an analysis of problem
solving in another task. Water-jar problems, studied extensively by
Luchins (1942), are transformation problems with definite goals, involving
a set of three jars of different capacities. In the form studied by Atwood
and Polson (1976), the largest jar is full in the initial state, and the
goal is to have that water divided equally between two jars. For example,
the capacities may be Jar A: 8 oz.; Jar B: 5 oz.; Jar C: 3 oz. Then in
the initial state, Jar A contains 8 oz. of water, and Jars B and C are
empty. The goal is to have 4 oz. of water each in Jars A and B. The
problem-solving operators involve pouring water from a source jar into a
target jar. Water can be poured into the target jar until it is full, if
there is enough water in the source jar; water can be poured out of the
source jar until it is empty, if there is enough room in the target jar.
Intermediate actions are not possible.
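
The pouring operator can be written down directly. The following is a minimal sketch, assuming states are represented as tuples of jar contents in the order (A, B, C); the function names are illustrative and do not come from Atwood and Polson's program.

```python
def pour(state, capacities, source, target):
    """Pour from source into target until the target is full or the source
    is empty. Return the new state, or None if no water would move."""
    if source == target:
        return None
    amount = min(state[source], capacities[target] - state[target])
    if amount == 0:
        return None
    new_state = list(state)
    new_state[source] -= amount
    new_state[target] += amount
    return tuple(new_state)


def legal_moves(state, capacities):
    """Map each legal (source, target) pair to the state it produces."""
    moves = {}
    for source in range(len(state)):
        for target in range(len(state)):
            result = pour(state, capacities, source, target)
            if result is not None:
                moves[(source, target)] = result
    return moves


CAPACITIES = (8, 5, 3)   # Jars A, B, C
START = (8, 0, 0)        # Jar A full, B and C empty

print(legal_moves(START, CAPACITIES))
# {(0, 1): (3, 5, 0), (0, 2): (5, 0, 3)}
```

Because each pour stops only when the target is full or the source is empty, the sketch has no way to express an intermediate action, which matches the constraint stated above.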

In the water-jar task, differences between any state and the problem
goal consist of discrepancies between the contents of the three jars in
that state and the contents that are specified in the goal. Atwood and
Polson hypothesized that subjects would judge their progress by combining
the discrepancies, forming an overall evaluation function for the current
state, and would try to select moves that would improve the value of this
function. They assumed that the evaluation of a specific state i was

    $e_i = |C_i(A) - G(A)| + |C_i(B) - G(B)|$,

where $C_i(A)$ and $C_i(B)$ are the actual contents of Jar A and Jar B in
state i, and $G(A)$ and $G(B)$ are the contents of Jar A and Jar B in the
goal state. (The contents of Jar C are redundant with those of A and B.)
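
As a worked example, the evaluation can be computed for the initial state of the (8,5,3) problem and for the two states reachable from it. This sketch assumes the evaluation is the sum of the absolute discrepancies for Jars A and B, with states written as (A, B, C) tuples; the function name is illustrative.

```python
def evaluation(state, goal):
    """Sum of absolute discrepancies between current and goal contents
    for Jars A and B (Jar C is redundant and is left out)."""
    return abs(state[0] - goal[0]) + abs(state[1] - goal[1])


GOAL = (4, 4, 0)

for state in [(8, 0, 0), (3, 5, 0), (5, 0, 3)]:
    print(state, evaluation(state, GOAL))
# (8, 0, 0) 8
# (3, 5, 0) 2
# (5, 0, 3) 5
```

On this measure, pouring from A into B looks much more attractive as a first move than pouring from A into C, since it lowers the evaluation from 8 to 2 rather than to 5.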

Atwood and Polson formulated a process model, based on the means-ends
strategy of attempting to reduce the evaluation to zero. They assumed that
at each move subjects consider various pouring operations that could be
made legally, and try to choose one that will make the evaluation function
smaller, or at least not increase its current value by more than a
threshold amount.*2 Atwood and Polson also made specific assumptions about
memory capacity; they assumed a limited short-term memory (STM) for holding
information about states that would be produced by alternative moves, and
they assumed that each state reached in solving the problem was stored in
long-term memory (LTM) with a fixed probability.
The model also specifies a sequence of processes for selecting a move.
The sequence includes calculating the evaluation function for alternative
moves, storing information about alternatives in STM, recognizing states
that have occurred before on the basis of information in LTM, and deciding
whether to make a given move under consideration. The assumptions of the
model allow for several possibilities. A move might be selected if it
leads to an acceptable state; this was assumed to be less likely if the
state was recognized as having occurred before. The moves stored in STM
may be examined, with selection of the [...] stored in LTM from previous
occurrences; a move may just be chosen at random from the set of possible
moves; or the subject may decide to return to the initial state of the
problem.

Atwood and Polson tested their model with data obtained from groups of
human subjects who solved different versions of the problem. Problems were
presented at computer terminals and records were kept of the moves made by
each subject. The model was implemented as a computer program which was
run with various values of the parameters. Because the model contains
probabilistic processes, it does not produce a single sequence of moves in
solving a problem. The model was run many times with each set of parameter
values, and a summary of its performance was obtained, consisting of the
average frequency of each of the possible problem states. A set of
parameter values was chosen for which the set of frequencies for two
problems (jar sizes of 8,5,3 and 24,21,3) approximated the frequencies
obtained from the human subjects. The parameter values that were chosen
seem quite reasonable. The size of STM was set at three alternative moves;
states reached in the problem were stored in LTM with probability .90; and
the threshold of acceptability for a new state was set at 1.0 above the
value of the current state.
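
The flavor of such a simulation can be conveyed with a heavily simplified sketch. It is not Atwood and Polson's program: it omits the STM stage and the option of returning to the initial state, and it assumes that a move is accepted when the resulting state is within the threshold and is not found in LTM, with a random legal move as the fallback. All function names are illustrative; only the values .90 and 1.0 and the 250 runs are taken from the text.

```python
import random


def successors(state, caps):
    """All states reachable by one legal pour."""
    result = []
    for s in range(3):
        for t in range(3):
            if s == t:
                continue
            amount = min(state[s], caps[t] - state[t])
            if amount > 0:
                nxt = list(state)
                nxt[s] -= amount
                nxt[t] += amount
                result.append(tuple(nxt))
    return result


def evaluation(state, goal):
    return abs(state[0] - goal[0]) + abs(state[1] - goal[1])


def choose_move(state, caps, goal, ltm, rng, threshold):
    """Take the first acceptable, unrecognized state; otherwise move at random."""
    options = successors(state, caps)
    rng.shuffle(options)
    limit = evaluation(state, goal) + threshold
    for nxt in options:
        if evaluation(nxt, goal) <= limit and nxt not in ltm:
            return nxt
    return rng.choice(options)


def simulate(caps, goal, rng, p_store=0.90, threshold=1.0, max_moves=10_000):
    """Return the number of moves taken to reach the goal in one run."""
    state, ltm, moves = (caps[0], 0, 0), set(), 0
    while state != goal and moves < max_moves:
        if rng.random() < p_store:      # store the visited state in LTM
            ltm.add(state)
        state = choose_move(state, caps, goal, ltm, rng, threshold)
        moves += 1
    return moves


rng = random.Random(0)
runs = [simulate((8, 5, 3), (4, 4, 0), rng) for _ in range(250)]
print("mean moves over 250 runs:", sum(runs) / len(runs))
```

Because the process is probabilistic, single runs differ; only aggregate statistics over many runs, such as the mean number of moves or visits per state, are meaningful to compare with data.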


Figure  2  here 


Figure 2. Observed and predicted values of mean visits per state for
four water-jar problems (from Atwood & Polson, 1976).


Results of the simulation are shown in Figure 2. Each set of
predictions was based on running the model 250 times. The data for each
problem were from a group of about 40 subjects, different from the subjects
whose data were used to estimate the parameters. One problem, (8,5,3), was
used in estimation, but the other three problems were different. The model
correctly predicted the order of difficulty of these four problems. For
two of the problems, (8,5,3) and (12,7,4), the detailed predictions of
response frequency were satisfactorily close to the data by a statistical
test. For the two harder problems, although the general shapes of the
frequency distributions agreed with the data, the model erred in predicting
too many returns to states at the beginning of a path that led to the goal.
As Atwood and Polson noted, this defect could be corrected by making the
probability of recognizing a previous state depend on the number of times
it has been encountered.


*2. This strategy differs from the means-ends strategy of GPS in one
significant respect. GPS considers all the ways in which the current state
and the goal differ, and selects a move to reduce the most important of
these qualitative differences. Atwood and Polson's model combines the
differences into a single numerical index, the value of the evaluation
function, and tries to reduce that difference by at least a threshold
amount. This difference probably does not have a significant effect on
predictions of performance in the water-jar task, but there are situations
in which strategies based on global evaluations and on individual
qualitative differences would lead to significantly different performance.

Conclusions. Section II.A has been concerned with problem solving in
situations that are novel to the problem solver, in which a definite goal
and the set of legal problem-solving operators are described by the
instructions. The situation requires using some general problem-solving
strategy. The findings show that in situations of this kind, the strategy
of means-ends analysis represents the major feature of human
problem-solving performance. In this section we have discussed evidence
consisting of individual thinking-aloud protocols and aggregate response
frequencies in two tasks. Findings fitting this general pattern have been
obtained in a wide range of problem-solving tasks, including puzzles such
as the Tower of Hanoi (Anzai & Simon, 1979) and physics textbook problems
(Simon & Simon, 1978), which we discuss below in Section II.C.

Means-ends analysis is perhaps the single most important strategy that
people employ for searching selectively through large problem spaces. The
selectivity is powerful because it points search in the direction of the
goal, selecting operators on the basis of their relevance to reducing the
distance from that goal. Use of means-ends analysis requires some
domain-specific knowledge; for example, it can be employed efficiently
only if the subject has learned enough about the problem domain to have
associated particular differences with particular operators for removing
them. However, it is basically a "weak method," applicable in situations
where the problem solver has little specific knowledge based on experience
in the problem domain.

II.B. Domain-Specific Knowledge for Familiar Problems with Specified Goals

We now turn to problems that are solved by individuals who have
specialized knowledge, acquired either through instruction or practice. We
will discuss problem solving in a domain of school mathematics, high school
geometry. Then we will discuss a phenomenon that has been salient in the
problem-solving literature, problem-solving set or Einstellung, which we
interpret as resulting from domain-specific knowledge structures.

II.B.1. Geometry Exercises. In school subjects such as geometry,
knowledge for solving problems is imparted intentionally through
instruction. Research conducted by Greeno (1978) had the goal of
investigating and characterizing the knowledge that is acquired by students
who learn successfully in the course.

The  main  data  were  obtained  in  a  series  of  interviews  conducted 
approximately  once  each  week  with  six  students  who  were  taking  a  standard 
high  school  course  in  geometry.  In  each  interview,  an  individual  student 
worked  for  about  20  minutes,  during  which  he  or  she  typically  solved  three 
or  four  problems.  Most  of  the  problems  that  were  solved  were  typical  of 
homework or test problems that the students were working on at that time in
the course. Students were asked to think aloud as they worked, and their
protocols  were  recorded  and  transcribed. 


Figure  3  here 


One of the problems solved in an early session (during the second
month of the course) is shown in Figure 3. The problem as it was presented
is shown in the upper left. The upper right diagram provides notation for
referring  to  the  various  angles  in  the  diagram.  The  seven  steps  shown 
below  the  diagrams  are  a  formal  solution  with  inferences  and  justifying 
reasons.  The  students  were  not  required  to  write  the  solution  steps  of 
this  problem  formally  but  they  were  required  to  state  aloud  the 
intermediate  inferences  that  they  made.  Most  of  the  students  solved  the 
problem  in  Figure  3  correctly.  We  will  discuss  specific  aspects  of  their 
solutions  below.  They  were  generally  similar  to  the  solution  shown  in 
Figure  3. 

The solution shown in Figure 3 was given by a computational model
called Perdix that was formulated to simulate the students' performance.
The structures and processes represented in Perdix are hypotheses about the
knowledge that students acquire in a geometry course.

Problem-Solving Knowledge. Perdix contains three kinds of knowledge,
all represented as production rules: (1) problem-solving operators that
make inferences, (2) perceptual concepts that recognize patterns in
diagrams, and (3) strategic processes that set goals and select plans for
problem-solving activity.

     Statement                          Reason

1.  meas. A1 (P) = 40°             1.  Given
2.  A1 (P) ≅ A6                    2.  Vertical angles
3.  A6 ≅ A8                        3.  Corresponding angles
4.  A8 supplem. A12 (Q)            4.  Interior angles on same side
5.  A6 supplem. A12 (Q)            5.  Substitution
6.  A1 (P) supplem. A12 (Q)        6.  Substitution
7.  meas. A12 (Q) = 140°           7.  Definition of supplem.


Figure 3. A solved problem in geometry. (A1, A6, etc. in the
solution refer to the positions of angles in the upper right diagram.)


Problem-solving operators in geometry correspond to the theorems,
postulates, and definitions that are used as reasons to justify steps in a
problem solution. Examples include "Vertical angles are congruent" (a
theorem), "Corresponding angles are congruent" (a postulate), and "If two
angles are supplementary, the sum of their measures is 180°" (a
definition). When the antecedent of one of these propositions is satisfied
in a problem, then the consequent can be inferred. For example, because A1
and A6 are vertical angles in Figure 3, the inference that A1 and A6 are
congruent is permitted. The propositions that correspond to the
problem-solving operators are prominent in geometry instruction. They are
represented in Perdix as production rules, with the antecedents as
conditions and the relations that can be inferred as actions.
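
A minimal sketch of operators written as condition-action rules follows. It is not Perdix's representation: facts are plain tuples, the rule set covers only the relations used in Figure 3, and the sketch propagates angle measures directly rather than substituting relations as the formal solution does. The fact names ("vertical", "same_side_interior", and so on) are illustrative assumptions.

```python
def run_productions(facts):
    """Apply the rules repeatedly until no new fact can be inferred."""
    facts = set(facts)
    while True:
        new = set()
        measures = {f[1]: f[2] for f in facts if f[0] == "measure"}
        for fact in facts:
            kind, a, b = fact
            # IF two angles are vertical or corresponding THEN they are congruent.
            if kind in ("vertical", "corresponding"):
                new.add(("congruent", a, b))
            # IF two angles are interior on the same side THEN they are supplementary.
            elif kind == "same_side_interior":
                new.add(("supplementary", a, b))
            # IF a relation holds and one measure is known THEN infer the other.
            elif kind in ("congruent", "supplementary"):
                for known, unknown in ((a, b), (b, a)):
                    if known in measures and unknown not in measures:
                        m = measures[known]
                        new.add(("measure", unknown, m if kind == "congruent" else 180 - m))
        if new <= facts:
            return facts
        facts |= new


given = {
    ("measure", "A1", 40),
    ("vertical", "A1", "A6"),
    ("corresponding", "A6", "A8"),
    ("same_side_interior", "A8", "A12"),
}
result = run_productions(given)
print(("measure", "A12", 140) in result)  # True
```

The pattern facts in `given` are supplied by hand here; in the account above they would be produced by the perceptual recognition processes.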

Patterns of information in the problem have to be recognized to
determine that a problem-solving operator can be applied. For example, to
apply the inference rule "Vertical angles are congruent" in Figure 3 and
thus infer that A1 and A6 are congruent, the problem solver must first
recognize that A1 and A6 are vertical angles. In the geometry course,
perceptual concepts are taught with examples using diagrams. In Perdix,
knowledge for recognizing patterns is represented by discrimination
networks, similar to the structures in the Elementary Perceiver and
Memorizer, EPAM (Feigenbaum, 1963) and the Concept Learning System, CLS
(Hunt, Marin & Stone, 1966). Perdix's recognition system is based on
features of a diagram, such as sides of two angles that are collinear,
along with other information that may be given or inferred, such as
statements that lines are parallel or perpendicular. An example is shown
in Figure 4, which represents the process that can recognize a pair of
vertical angles, a pair of angles formed by bisecting an angle, and other
patterns that involve pairs of angles that have a single vertex.


Figure  4  here 


Strategic  knowledge  is  needed  for  setting  goals  that  organize 
problem-solving  activity.  In  the  example  problem  of  Figure  3,  the  main 
goal  is  to  find  the  measure  of  angle  Q.  This  cannot  be  achieved  directly, 
and  the  problem  solver  must  know  that  a  way  of  finding  the  measure  of  an 
angle  is  to  find  a  quantitative  relationship  (e.g.,  congruent  or 
supplementary)  of  the  unknown  angle  with  one  that  has  a  known  measure. 

This  can  be  represented  as  a  production:  when  the  current  goal  is  to  find 
the  measure  of  an  angle,  and  the  measure  of  another  angle  is  known,  set  a 
subgoal  of  finding  a  quantitative  relation  between  the  unknown  angle  and 
the  known  angle. 
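
The strategic production just described can be sketched as a single condition-action function. The goal and subgoal tuples and the function name are illustrative assumptions, not Perdix's actual notation.

```python
def find_measure_production(goal, known_measures):
    """IF the current goal is to find the measure of an angle
    AND the measure of another angle is known
    THEN set a subgoal of relating the unknown angle to each known angle."""
    kind, angle = goal
    if kind != "find_measure":
        return []
    return [
        ("find_relation", angle, other)
        for other in sorted(known_measures)
        if other != angle
    ]


subgoals = find_measure_production(("find_measure", "Q"), {"P": 40})
print(subgoals)  # [('find_relation', 'Q', 'P')]
```

The production does no geometry itself; it only reorganizes the problem by replacing a goal that cannot be achieved directly with one that the inference operators can work on.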


Table  5  here 


The importance of strategic knowledge is illustrated in the protocol
in Table 5. The student was working on the problem shown in Figure 3. The
student marked several angles in a copy of the diagram; these are
indicated in the protocol in parentheses in relation to the diagram in the
upper right part of Figure 3. For example, "P would equal one (→A1)"
indicates that a label "1" was written on the angle in the student's
diagram at position A1.

[Figure 4, fragment] Start: Bind V1 and V2 to angles; V3 to their
shared vertex.


Table 5

Protocol of an Attempt to Solve Figure 3

S: All right. I would put, like, P would equal one (→A1).

E: Okay.

S: And then, two (→A6).

E: Put in two there, right.

S: And then three (→A15); no, wait, three (→A15) and four (→A12), I
guess.

E: Okay. Now, why did you put two there?

S: Well, I don't know. It could have something to do with vertical
angles.

E: Okay.

. . .

S: All right, the first thing I guess I should try to do, I would try to
find if there were any alternate interior or corresponding angles?

E: Okay.

S: Or any of those.

E: Mm-hm.

S: I guess I would say that ... well, wait a minute. I guess maybe I
would put five there (→A16).

E: Okay.

S: I don't know if I would need this.

E: Okay.

S: These two are supplementary.

E: Right.

S: That doesn't help much. And then, the measure of angle five ... would
it equal the measure of angle one?

E: Well, you might have to work that out.

S: How ... if this equals ... this equals forty.

E: That's right.

S: Oh, all right. Wait, the measure ... I can't, I don't know.
I don't know how to do these.

E: Okay.


The student seems to have known the problem-solving operators and the
geometric patterns needed to apply them (this was confirmed in another part
of the interview) but was unable to solve the problem. The most likely
hypothesis is that the student lacked knowledge of the problem-solving
strategy needed in this problem. The strategy involves forming a chain of
angles that are related by congruence. Knowledge of this strategy involves
setting a series of goals: when the problem requires a relation between
two angles, and none can be recognized, then find an angle related to one
of them by congruence and try to relate that angle to the other angle.
This strategic procedure can be applied recursively until an angle is found
that is related to the goal angle by one of the geometric relations from
which a quantitative relation can be inferred.
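
The recursive character of the chaining strategy can be shown in a short sketch. It assumes that the recognizable congruences and the directly usable relations are available as lists of angle pairs; the function name and the data, taken from the Figure 3 solution, are illustrative.

```python
def chain_to(angle, target, congruent, direct, visited=None):
    """Return a chain of congruent angles starting at `angle` and ending at an
    angle that is directly related to `target`, or None if there is none."""
    visited = (visited or set()) | {angle}
    # A relation to the target can be recognized: the chain is complete.
    if (angle, target) in direct or (target, angle) in direct:
        return [angle]
    # Otherwise find an angle congruent to this one and try again from there.
    for a, b in congruent:
        if angle in (a, b):
            nxt = b if angle == a else a
            if nxt not in visited:
                rest = chain_to(nxt, target, congruent, direct, visited)
                if rest is not None:
                    return [angle] + rest
    return None


congruent = [("A1", "A6"), ("A6", "A8")]   # vertical, corresponding
direct = {("A8", "A12")}                   # supplementary (same-side interior)

print(chain_to("A1", "A12", congruent, direct))  # ['A1', 'A6', 'A8']
```

The chain A1, A6, A8 ends at an angle that is supplementary to A12, so the known measure of A1 can be carried along the chain and the measure of A12 computed.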

Four of the six students who were interviewed in Greeno's study solved
the problem in Figure 3 successfully, apparently having acquired the
strategy of forming a chain of congruent angles. The students differed in
the specific sequences of angles that they used, which could be the result
of differences in the way that they scanned the diagram looking for angles
to add to the chain, or of differences in the ease with which different
students recognized various geometric patterns. About a week after giving
the protocol in Table 5, that student was successful in solving a different
problem that also required the chaining strategy.

In  geometry  instruction,  very  little  strategic  knowledge  is  taught 
explicitly;  it  has  to  be  inferred  by  the  students  from  example  problems. 

We  believe  that  this  is  a  common  feature  of  instruction  in  domains 
requiring  acquisition  of  knowledge  for  problem  solving,  and  we  consider  the 
explicit  teaching  of  problem-solving  strategies  as  a  potentially  productive 
development  for  instruction,  based  on  the  results  of  basic  research  on 
cognitive  processes  in  problem  solving. 

Strategic  knowledge  is  represented  in  Perdix  by  productions  that 
select  plans  for  work  on  problems.  A  plan  is  a  general  approach  to  the 
problem, based on information in the problem situation. GPS forms such
plans  using  its  general  planning  strategy,  described  on  pages  19-20. 

Perdix  has  specific  cognitive  structures  for  plans  that  are  used  frequently 
for  geometry  problems.  Forming  a  chain  of  congruent  angles  is  one  such 
plan.  Another  is  using  congruent  triangles  to  prove  that  two  angles  or  two 
line  segments  are  congruent. 

The  organization  of  planning  knowledge  in  Perdix  is  similar  to  that 
developed  by  Sacerdoti  (1977),  called  a  procedural  network.  In  a 
procedural  network,  there  are  units  of  knowledge  corresponding  to  actions 
at  different  levels.  Each  of  these  knowledge  units  includes  information 
about  the  prerequisites  and  consequences  of  an  action  that  can  be 
performed.  In  Perdix,  knowledge  of  each  plan  includes  information  about 
goals  that  can  be  achieved  using  the  plan  (its  consequences),  conditions  in 
problems  that  make  the  plan  promising  (its  prerequisites),  and  subgoals 
that  should  be  set  if  the  plan  is  adopted. 
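
One way to picture a plan unit of this kind is as a small record with consequences, prerequisites, and subgoals. The sketch below is an assumption-laden illustration: the two plans are the ones named in the text, but the specific prerequisite and subgoal strings, the `Plan` class, and the selection rule (prefer the plan with the fewest unmet prerequisites) are hypothetical and are not taken from Perdix or from Sacerdoti's system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    consequences: frozenset   # goals the plan can achieve
    prerequisites: frozenset  # features that make the plan promising
    subgoals: tuple           # goals to set if the plan is adopted


PLANS = [
    Plan(
        name="chain of congruent angles",
        consequences=frozenset({"relate two angles"}),
        prerequisites=frozenset({"parallel lines", "transversal"}),
        subgoals=("find an angle congruent to one of the pair",
                  "relate that angle to the other"),
    ),
    Plan(
        name="congruent triangles",
        consequences=frozenset({"prove angles congruent",
                                "prove segments congruent"}),
        prerequisites=frozenset({"two triangles containing the parts"}),
        subgoals=("prove the triangles congruent",
                  "apply corresponding parts"),
    ),
]


def select_plans(goal, features):
    """Plans whose consequences include the goal, fewest unmet prerequisites first."""
    candidates = [p for p in PLANS if goal in p.consequences]
    return sorted(candidates, key=lambda p: len(p.prerequisites - features))


best = select_plans("prove angles congruent", frozenset())[0]
print(best.name, "| missing:", sorted(best.prerequisites - frozenset()))
```

Keeping the unmet prerequisites explicit is what would allow a partially matching plan to generate a subproblem of completing the missing pattern.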

Perdix's strategic knowledge constitutes the main way in which it
differs from GPS. Strategic knowledge in GPS is the general means-ends
strategy that can be used in any domain for which the problem solver is
taught the operators, together with the productions that connect operators
with differences, and given the goal of a problem. The hypothesis
represented in Perdix is that instruction in a domain such as geometry
leads to acquisition of strategic knowledge specific to the domain, such as
the schematic knowledge that represents plans to use chains of congruent
angles or congruent triangles. Both GPS and Perdix construct plans that
are more general than the actions that must be performed in solving the
problem. The difference is that GPS forms plans using its general
means-ends strategy, while Perdix's plans are based on knowledge of
specific geometry strategies.

When GPS plans, it uses the strategic process of means-ends analysis in
a problem space that contains features taken directly from the basic
representation of the problem. GPS's planning space can be acquired by
learning which features of objects should be given first priority. In
Perdix, planning uses schematic knowledge of specific methods applicable to
problems in the domain of geometry. These schemata include general
subgoals, such as proving that triangles are congruent or finding an angle
with a relation based on parallel sides, that can be used as intermediate
steps. The associations of these subgoals with the goals that they help to
achieve have to be acquired by students; they are not explicitly given as
goals of problems in which they are used.

Solution of Ill-Structured Problems. A hypothesis that is consistent
with the analysis of geometry problem solving is that domain-specific
strategic knowledge may provide the main basis for solving ill-structured
problems. Problems may lack definite structure for many reasons. One
important source of indefinite structure is that a problem may require
knowledge from several different sources, so its solution requires
coordinated work in several disparate problem spaces (Simon, 1973).

A modest form of this kind of problem arises in geometry, involving
problems that require construction of auxiliary lines. The problem space
that is presented, including a diagram, given information, and a goal to be
proved, must be augmented in order for the problem to be solved. Greeno,
Magone, and Chaiklin (1979) proposed that solution of such problems can be
based on an individual's knowledge of plan schemata. In the model Perdix,
the need for an auxiliary line is recognized when a plan's prerequisites
are partially satisfied in the problem situation. This leads to definition
of a subproblem; the goal is to complete the pattern of features that
constitute the prerequisites, and this goal is achieved in a problem space
with operators that are appropriate for that goal.


Figure  5  and  Table  6  here 


An example is shown in Figure 5, the drawing and written work of a
student on the following problem: "Prove that if two sides of a triangle
are congruent, then the angles opposite those sides are congruent." The
protocol given by this student is in Table 6. After drawing the triangle
ABC, the student added the line CD, which is not specified in the initial
problem space. The student's comments at *1 and *2, along with the
retrospective comment at *3, provide evidence that construction of the
auxiliary line was related to a plan of proof involving congruent
triangles, and the construction completed a pattern that is required for
that plan to be applied, that is, the presence of two triangles in the
diagram. Perdix simulates solutions like this with a process of pattern
recognition that identifies partial patterns of two triangles missing a
line, and uses special problem-solving operators to complete the patterns.

Figure 5. Drawing and written work on the problem, "Prove that if
two sides of a triangle are congruent then the angles opposite
those sides are congruent."

Table 6

Protocol for the Problem of Figure 5


*1 


*2 


S: Okay, if two sides of a triangle are congruent, so . . .
draw a triangle.

E: Okay.

S: Then the angles opposite those sides are congruent. Okay, so, like,
if I have . . . given: triangle ABC, I'll letter it ABC.

E: Right.

S: And then I have . . . prove: ... do I already have these two sides
given? Okay. Two sides of a triangle are given.

E: Mm-hmm.

S: Let me go back to my given and say that segment AC is congruent to
segment BC.

E: Okay.

S: And I want to prove that angle A is congruent to angle B.

E: Good.

S: All right. Let me write down my given. Okay. And mark my
congruent sides. Okay, so, I want to prove that angle A is congruent
to angle B. Now, let's see. Do you want . . . ?

E: Yeah. Why are you drawing a line there?

S: I don't know yet.

E: Oh, that's okay. Don't erase it.

S: I'm going to do it, no, I just . . .

E: Oh, okay, fine.

S: Okay . . . okay, then I could ... if I drew a line . . .

E: Mm-hmm.

S: That would be the bisector of angle ACB, and that would give me
. . . those congruent angles . . . no. (Pause.) Yeah, well, that would
give me those congruent angles, but I could have the reflexive property,
so this would be equal to that. Okay, I've got it.

E: Okay.

S: Okay.

E: Now, before you go ahead and write it all down, when you said you
were going to draw the line . . .

S: Yeah.

E: And I said why are you doing that, and you said you didn't know
yet, what do you think happened to give you the idea of making it
the bisector?

S: Okay, well, I have to try to get this ... I have to try to get
triangle ACD congruent to BCD. Because, if I do that, then angle A
is congruent to angle B because corresponding parts of congruent
triangles are congruent.

E: So you were drawing the line to give yourself triangles, is that the
idea?

S: No, to ... to get a side that was in both triangles.

E: Okay.

S: And to get congruent angles.

E: So that's why you drew it as the bisector.

S:  Yeah. 
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Another  way  in  which  problems  can  be  ill-struv.  tured  involves  the  way 
in  which  goals  are  formulated.  Goals  in  well-structured  problems  are 
presented  as  specific  objects  (e.g.,  a  specific  logic  expression  to  be 
derived  or  a  specific  distribution  of  water  imong  some  jars).  In 
ill-structured  problems,  goals  are  often  underdetermined,  with  several 
alternative  ways  in  which  they  might  be  satisfied.  Examples  are  frequently 
cited  from  art  or  science,  such  as  the  goal  to  compose  a  fugue,  or  to 
design  an  interesting  experiment.  In  school  geometry,  the  goals  of 
problems  are  usually  well  specified,  but  a  subgoal  that  arises  in  many 
problems  functions  as  an  indefinite  goal  for  experienced  problem  solvers. 
This is the goal of proving that two triangles are congruent. There are
several ways in which congruence of triangles can be proved, involving
different patterns of congruent components such as side-side-side,
side-angle-side, and so on. Beginning learners treat these as definite
subgoals, trying one after another until one is found that works (Anderson,
Greeno, Kline & Neves, 1981). However, more experienced students do not
mention  specific  patterns  in  their  protocols,  and  appear  to  engage  in 
relatively diffuse search for congruent components of triangles with a kind
of monitor that identifies whatever pattern of congruent components happens
to  emerge.  Greeno  (1976)  hypothesized  that  experienced  students  acquire  an 
integrated  structure  of  knowledge  in  the  form  of  a  pattern-recognizing 
system  that  represents  the  goal  of  proving  that  triangles  are  congruent.  A 
version  of  this  that  was  implemented  in  Perdix  is  shown  in  Figure  6. 
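
The notion of a monitor that identifies whatever congruence pattern happens to emerge can be illustrated with a minimal sketch in Python. The part labels, the PATTERNS table, and the function recognize_congruence are hypothetical conveniences for this illustration; they are not taken from Perdix or from Figure 6, and only three familiar patterns are included.

```python
# Hypothetical sketch of a pattern-recognizing monitor for triangle congruence.
# Corresponding parts are labeled by position: sides "s1", "s2", "s3" and
# angles "a1", "a2", "a3", where angle a_i is opposite side s_i (so a3 is the
# angle included between sides s1 and s2).
PATTERNS = {
    "SSS": [{"s1", "s2", "s3"}],
    "SAS": [{"s1", "a3", "s2"}, {"s2", "a1", "s3"}, {"s1", "a2", "s3"}],
    "ASA": [{"a1", "s3", "a2"}, {"a2", "s1", "a3"}, {"a1", "s2", "a3"}],
}


def recognize_congruence(known_parts):
    """Return the name of the first pattern satisfied by the parts known to
    be congruent, or None if no pattern has emerged yet."""
    known = set(known_parts)
    for name, alternatives in PATTERNS.items():
        if any(required <= known for required in alternatives):
            return name
    return None
```

In this sketch the solver never sets a specific pattern as a subgoal. It simply adds congruent parts as they are found and asks the monitor whether any pattern is now complete, so a shared side, a pair of congruent sides, and the included angles would be reported as SAS.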


Figure  6  here 


Acquisition of Problem-Solving Skill. An important question is how
the knowledge required for solving problems in a domain such as geometry is
acquired. We discuss studies of learning involving the three kinds of
knowledge for problem solving: problem-solving operators, perceptual
concepts for pattern recognition, and strategic knowledge.

Processes  of  acquiring  problem-solving  operators  were  analyzed  by 
Anderson (1982), based on observations of three students as they studied
and worked problems in the early sections of a geometry text. Anderson
simulated  processes  of  acquiring  problem-solving  skill  in  a  version  of  his 
ACT  model  (cf.  Anderson,  1983). 

A  major  aspect  of  Anderson's  model  is  a  process  that  acquires 
cognitive  procedures  from  declarative  information.  ACT  learns  new 
procedures by working on problems. When ACT encounters a problem for which
it has not learned a procedure, it uses general problem-solving methods
along with information that is available, as it is in a text. For example,
a geometry problem may require finding a theorem that can justify a step in
a proof. ACT has a general procedure for searching in a list of theorems
and matching features of theorems to the information in a problem. When an
applicable theorem is found, ACT asserts that theorem to solve that part of
the problem.

ACT has a learning process called proceduralization, which forms new
production rules that are added to ACT's procedural knowledge. A new
production can be formed when a theorem has been found and applied
successfully in problem solving. The new production has conditions
corresponding  to  selected  features  in  the  problem  situation,  and  an  action 
of asserting the theorem. The production is a new problem-solving
operator;  ACT  has  acquired  a  new  ability  to  assert  a  theorem  in 
appropriate  conditions,  without  having  to  search  in  the  list  of  theorems  in 
the  text.  It  has  learned  the  theorem,  not  in  the  sense  of  having  memorised 
it,  but  in  the  sense  of  being  able  to  recognize  when  it  is  applicable  and 
to  apply  it. 
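
The contrast between searching a list of theorems and firing a learned production can be sketched as follows. This is a minimal illustration under stated assumptions, not Anderson's ACT implementation: the class Solver, the THEOREMS list, and the feature strings are all hypothetical.

```python
# Hypothetical sketch of proceduralization. THEOREMS stands in for declarative
# information available in a text; its entries are invented for illustration.
THEOREMS = [
    {"name": "vertical-angles", "features": {"vertical angles"},
     "concludes": "angles congruent"},
    {"name": "side-side-side", "features": {"three congruent sides"},
     "concludes": "triangles congruent"},
]


class Solver:
    def __init__(self):
        self.productions = []  # learned (goal, conditions, theorem name) triples
        self.searches = 0      # number of times the general search was needed

    def solve(self, goal, features):
        features = set(features)
        # A learned production fires when its goal and conditions match.
        for prod_goal, conditions, name in self.productions:
            if prod_goal == goal and conditions <= features:
                return name
        # Otherwise fall back on the general method: search the theorem list.
        self.searches += 1
        for theorem in THEOREMS:
            if theorem["concludes"] == goal and theorem["features"] <= features:
                # Proceduralization: store a production whose conditions are
                # the selected features and whose action asserts the theorem.
                self.productions.append(
                    (goal, frozenset(theorem["features"]), theorem["name"]))
                return theorem["name"]
        return None
```

The first call that succeeds by search leaves behind a production, so a later problem with the same goal and relevant features is solved without consulting the list again.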

Acquisition  of  perceptual  concepts  for  pattern  recognition  in  problem 
solving  was  studied  by  Simon  and  Gilmartin  (1973)  in  the  domain  of  chess. 
The  learning  mechanism  used  was  adapted  from  the  EPAM  model  (Feigenbaum, 
1963),  which  simulates  acquisition  of  discrimination  networks  like  Figure 
4.  Simon  and  Gilmartin  developed  an  EPAM-type  model  that  acquired 
knowledge  of  patterns  of  chess  pieces  from  presentations  of  board 
positions.  This  knowledge  was  used  to  simulate  performance  in  a  task  of 
reconstructing positions after brief presentations, a task known to
differentiate players according to their level of skill (deGroot, 1965;
Chase & Simon, 1973; also see Section III.B.2).

Acquisition  of  strategic  knowledge  for  solving  problems  has  been 
studied  empirically  by  Schoenfeld  (1979).  Four  students  in  upper-division 
college  mathematics  courses  were  given  special  instruction  in  the  use  of 
five  heuristic  strategies  for  working  on  problems:  drawing  a  diagram, 
arguing by induction, arguing by contradiction or contrapositive,
considering  a  simpler  problem  with  fewer  variables,  and  establishing 
subgoals.  Each  strategy  was  presented  in  a  training  session,  lasting  about 
one hour, including explanation of conditions in which the strategy is
useful  as  well  as  practice  in  using  the  strategy.  Students  took  a  pretest 
and  a  posttest  with  problems  not  included  in  the  training.  Students  who 
received  the  special  training  had  a  list  of  the  strategies  available  during 
the  posttest  and  were  reminded  from  time  to  time  to  try  to  use  one  of  the 
strategies  if  they  were  not  progressing  well  on  a  problem.  Performance  of 
these students was superior to that of another group of students who had
worked  on  the  same  training  problems  as  the  instructed  group,  but  without 
explanation  of  the  strategies.  Thinking-aloud  protocols  confirmed  that 
students  considered  and  used  strategies  that  they  had  been  trained  to  use. 
The  training  was  especially  effective  with  strategies  that  have  clear  cues 
for their application: the fewer-variables strategy, cued by the presence
of  many  variables,  and  arguing  by  induction,  cued  by  an  integer  argument. 

Processes  of  acquiring  strategic  knowledge  have  been  addressed  in 
theoretical  analyses  by  Anzai  and  Simon  (1979)  and  by  Anderson,  Farrell, 
and  Sauers  (1982).  Anzai  and  Simon  observed  and  simulated  acquisition  of  a 
strategic  concept  in  the  Tower  of  Hanoi  puzzle.  The  concept  involves 
movement  of  a  set  of  disks  requiring  a  sequence  of  individual  moves,  with 
the  sequence  considered  as  a  global  action.  Anderson  et  al.  simulated 
acquisition  of  knowledge  for  applying  techniques  in  learning  to  program  in 
LISP.  In  both  of  these  theoretical  analyses,  important  factors  in 
acquiring  strategic  knowledge  are  activation  of  a  problem  goal  that  can  be 
achieved  by  a  sequence  of  actions  and  acquisition  of  a  production  in  which 
the  action  of  setting  that  goal  is  associated  with  appropriate  conditions 
in  the  problem  situation. 

II.B.2. Einstellung (Set). The context in which problem solving occurs
may  have  an  important  influence  on  the  process.  As  a  consequence  of 
previous  tasks  in  which  a  subject  has  been  engaged  or  previous  stimuli  that 
have  been  presented,  certain  responses  may  become  more  readily  and  speedily 
available  and  others  less  readily  available.  The  subject  has  acquired  a 
"set"  for  the  familiar  stimuli  and  responses. 

One  experimental  design  that  has  been  used  often  to  demonstrate  the 
effects of set is to present subjects with a sequence of tasks that induce
set, then a new sequence of tasks in which the set that has been induced
either  facilitates  or  impedes  performance  in  comparison  with  control 
subjects  who  were  not  exposed  to  the  first  sequence.  Luchins  (1942) 
conducted  a  well-known  set  of  experiments  using  this  design,  with  water-jar 
tasks.


In  Luchins'  version  of  the  water-jar  task,  subjects  must  measure  out  a 
specified  amount  of  water,  using  a  given  set  of  ungraduated  measuring  jars. 
A  source  of  water  is  assumed  to  be  available,  so  that  any  of  the  jars  can 
be  filled  to  its  capacity  if  the  subject  chooses  to  do  that.  In  addition, 
water  can  be  poured  from  one  jug  to  another,  until  the  target  jar  is  filled 
or  the  source  jar  is  empty,  and  the  contents  of  a  jar  can  be  discarded. 

Table  7  here 


The series of problems that Luchins used is in Table 7. Here, all the
problems except the first and the ninth can be solved by filling jar B,
then pouring from it to fill A, and then filling C twice (X = B - A - 2C).
But Problem 5 and Problems 7 through 11 can also be solved using only jars
A and C, by either adding the contents of C to the contents of A or
subtracting the contents of C from A. For Problem 9, the B - A - 2C
procedure does not work.
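
The relation between the three-jar and two-jar procedures can be checked mechanically. The sketch below is only an illustration: the function name is hypothetical, and the jar capacities in the usage note are values chosen for this example rather than entries read from Table 7.

```python
def available_procedures(a, b, c, target):
    """List which of the procedures discussed in the text yield the target
    amount, given jar capacities a, b, c."""
    procedures = []
    if b - a - 2 * c == target:   # fill B, pour off A once and C twice
        procedures.append("B - A - 2C")
    if a + c == target:           # two-jar procedure: add C to A
        procedures.append("A + C")
    if a - c == target:           # two-jar procedure: pour C off from A
        procedures.append("A - C")
    return procedures
```

With capacities 21, 127, and 3 and a target of 100, only B - A - 2C works. With 23, 49, and 3 and a target of 20, both B - A - 2C and A - C work, which is the situation in which set can lead a subject to the longer solution. With 28, 76, and 3 and a target of 25, only A - C works, so a subject who persists with B - A - 2C cannot solve the problem.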

Subjects  given  Problems  7  through  11  immediately  after  solving  Problem 
1  generally  use  the  two-jar  procedure  just  described.  Subjects  who  are 
first given Problems 1 through 6 generally use the B - A - 2C procedure,
which  is  more  complex  than  necessary  for  Problems  7  through  11,  and  they 
have  considerable  difficulty  with  Problem  9. 

Set  effects  can  be  the  result  of  several  cognitive  processes;  we  will 
discuss  three  that  have  been  put  forward. 

First, set may be the result of a bias in retrieving knowledge
structures  from  memory.  A  standard  assumption  is  that  the  alternative 
concepts  or  cognitive  procedures  that  might  be  retrieved  have  varying 
strengths  or  levels  of  activation  which  determine  the  probabilities  of 
their  retrieval.  If  a  cognitive  unit  has  been  used  successfully  several 
times  in  the  immediate  past,  this  results  in  a  relatively  high  level  of 
activation for that unit.
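
One simple way to make the activation assumption concrete is sketched below. The sketch assumes, purely for illustration, that retrieval probability is proportional to strength and that each successful use adds a fixed increment; neither assumption, nor the function names, comes from a specific model discussed here.

```python
def retrieval_probabilities(strengths):
    """Probability of retrieving each unit, assumed proportional to strength."""
    total = sum(strengths.values())
    return {unit: s / total for unit, s in strengths.items()}


def strengthen(strengths, unit, increment=1.0):
    """Return new strengths after one successful use of the given unit."""
    updated = dict(strengths)
    updated[unit] += increment
    return updated
```

If two procedures start with equal strength 1.0 and one of them is used successfully five times, its strength becomes 6.0 and its retrieval probability rises from 1/2 to 6/7 under these assumptions.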


Figure  7  here 


Table  7 

Given: M is the midpoint of AB and CD;
AC ≅ BD.

Prove: ∠AMC ≅ ∠BMD


Figure 7. An Einstellung problem in geometry.

Schemata used in planning provide one kind of structure that can
account for set. An example is in the domain of geometry, where Greeno et
al. (1979) developed a simulation model with planning schemata, described
in Section II.B.1. Luchins (1942) included a study of geometry problem
solving in his investigations of Einstellung. Figure 7 shows the kind of
problem used as a test. The proof can be obtained in one step: ∠AMC and
∠BMD are vertical angles. However, if subjects were first given a series
of problems where they used congruent triangles in proofs, they were likely
to construct the more complex proof for Figure 7 in which triangles AMC and
BMD are proved congruent by Side-Side-Side. An explanation is provided if
we assume that students have a schema corresponding to the plan of using
congruent triangles for a proof, and that this schema has a high level of
activation because of its use in the initial series of problems. Greeno et
al. (1979) reported an experiment with a test problem that could be solved
either using congruent triangles or angles formed by parallel lines;
either method required construction of an auxiliary line. Subjects were
given series of problems before the test problem involving either congruent
triangles or parallel lines, and were strongly biased toward solutions of
the same type they had been giving.

Set  based  on  activation  may  either  facilitate  task  performance  or 
impede it, depending on whether the memory elements that are activated
contain the information that is needed for performance. Sweller and Gee
(1978)  showed  that  the  tendency  to  use  a  previously  successful  rule  can 
greatly  facilitate  solution  of  a  relatively  complex  problem,  presumably  by 
eliminating  the  need  to  search  in  a  large  space  of  possibilities,  even 
though  in  the  same  situation  it  prevents  subjects  from  noticing  a  simpler 
solution method. Such situations are common, since set is bound to arise
wherever  memory  organization  is  not  neutral  with  respect  to  the 
problem-solving  process  —  that  is,  wherever  there  are  alternative  ways  of 
storing  information  in  memory,  one  of  which  may  be  more  conducive  to 
retrieval  in  a  given  problem  context  than  another. 

A  second  possible  explanation  of  Einstellung  is  provided  by 
composition  of  productions,  investigated  first  by  Lewis  (1978). 

Composition  is  a  process  in  which  a  newly  acquired  production  performs 
actions  that  required  two  or  more  productions  in  the  previous  knowledge 
structure.  Composition  generally  makes  performance  more  efficient  by 
providing  a  way  to  act  directly  rather  than  requiring  several  steps  to 
achieve  a  goal.  The  new  productions  created  by  composition  usually  have 
conditions that are relatively specific, and in some production systems
(including ACT) this leads to their being preferred to productions with
less  specific  conditions.  Anderson  (1982)  noted  that  this  would  simulate 
the  performance  observed  by  Luchins  (1942)  on  problems  like  Figure  7. 
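
The role of specificity in selecting among productions can be sketched as follows. This is a hypothetical illustration, not ACT's conflict-resolution mechanism: specificity is approximated here by the number of conditions, and the production names and condition strings are invented.

```python
def select_production(productions, situation):
    """Among productions whose conditions all hold in the situation, prefer
    the one with the most conditions (a crude stand-in for specificity)."""
    matching = [p for p in productions if p["conditions"] <= situation]
    if not matching:
        return None
    return max(matching, key=lambda p: len(p["conditions"]))


# A general production, and a more specific one of the kind that composition
# might create after practice with congruent-triangle proofs.
vertical = {"name": "vertical-angles",
            "conditions": {"goal: angles congruent", "angles are vertical"}}
composed = {"name": "congruent-triangles-by-SSS",
            "conditions": {"goal: angles congruent",
                           "angles lie in two triangles",
                           "three pairs of congruent sides"}}

# A situation like Figure 7, in which both productions apply.
figure7 = {"goal: angles congruent", "angles are vertical",
           "angles lie in two triangles", "three pairs of congruent sides"}
```

Before the composed production exists, the vertical-angles production is selected for the Figure 7 situation. Once it has been added, it matches on more conditions and is selected instead, reproducing the longer proof.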

Third,  some  set-like  phenomena  could  also  be  produced  by  the  basic 
problem-solving  procedure  that  a  subject  uses.  We  have  already  noted  that 
subjects  very  frequently  use  the  heuristic  of  means-ends  analysis  —  that 
is,  comparing  situation  with  goal  and  taking  an  action  that  seems  to  reduce 
the  difference  between  them.  In  their  analysis  of  behavior  of  subjects 
solving water-jar problems, Atwood and Polson (1976) showed that where
alternative  actions  could  be  taken,  most  subjects  selected  the  one  that  led 
to  a  situation  that  was  most  like  the  goal  situation.  Like  the  more 
specific sets induced by Luchins' manipulation, this general set to pick
paths that lead toward the desired goal can sometimes interfere with
problem solution. Where memory limitations prevent subjects from looking
far ahead, this goal-oriented strategy may sometimes produce a myopic
preoccupation with immediate progress, and an avoidance of paths that lead
to the goal only indirectly. Jeffries, Polson, Razran, and Atwood (1977)
showed  that,  without  look-ahead,  subjects  solving  the  Missionaries  and 
Cannibals  puzzle  would  have  difficulty  (as,  in  fact,  they  do)  on  the  step 
where  they  were  required  to  bring  two  persons  back  from  the  farther  bank  of 
the  river  to  which  they  were  trying  ultimately  to  transport  all  of  them. 

II.C. Problems with Specified Procedures

In the tasks discussed in II.A and II.B, a definite goal is presented
to the problem solver. In this section we discuss tasks in which the
problem presents material for a procedure, and the task is to apply the
procedure to find the result that is obtained. While the tasks discussed
in II.A and II.B specify a goal and require a method to get there, the
tasks we discuss now specify a method and ask where the method leads.

The  tasks  that  we  discuss  come  from  arithmetic.  Many  tasks  of 
applying procedures occur in mathematics, for example, finding a derivative
in  calculus  or  finding  the  product  of  two  expressions  in  algebra.  Some 
people  would  object  that  such  tasks  do  not  involve  problem  solving,  since 
they  require  knowledge  of  a  procedure  rather  than  search  in  a  space  of 
possible  solutions.  On  the  other  hand,  these  tasks  are  considered  as 
problems  by  students  who  receive  them  as  homework  assignments  (and 
presumably  by  teachers  who  assign  them). 

More  significantly,  the  knowledge  required  for  these  procedure-based 
tasks  is  similar  to  the  knowledge  that  individuals  acquire  when  they  learn 
to  solve  problems  that  do  not  specify  solution  methods,  such  as  geometry 
proof  exercises  or  water  jar  problems.  Knowledge  for  planning  in  geometry 
constitutes  a  set  of  procedures  that  the  student  has  acquired  for  solving 
various  kinds  of  problems.  Use  of  these  procedures  requires  recognition  of 
their applicability, which is not required if the problem says "subtract"
or "differentiate"; however, characteristics of the procedural knowledge
that have been identified by theoretical analyses of the various tasks are
more  notable  for  their  similarities  than  for  their  differences. 

Our  discussion  in  this  section  is  focused  on  empirical  methods  that 
have  been  used  to  infer  the  nature  of  procedural  knowledge.  First,  we 
discuss  inferences  based  on  patterns  of  errors  that  occur  in  elementary 
arithmetic.  Then  we  discuss  inferences  from  latency  data. 

II.C.1. Diagnosis of Cognitive Procedures from Patterns of Errors.
Brown and Burton (1980) analyzed children's knowledge for subtraction
problems with multidigit numbers. Their data were obtained in an
arithmetic achievement test taken by 1325 school children. Ordinarily,
performance  on  tests  is  used  to  assign  a  simple  score  for  each  student, 
allowing  judgments  of  which  students  have  learned  a  satisfactory  amount. 
Brown  and  Burton's  analysis  shows  that  test  data  are  potentially  much 
richer, and can be used to make stronger inferences about the nature of
children's  knowledge. 

The more powerful theoretical use of test data depends on two things.
First, performance on the test is not characterized simply by the number of
problems correct, but by the specific answers given to all the problems,
with particular attention to the incorrect answers.*3 Second, the analysis
of each student's test performance consists of a model of a procedure for
solving the problems.

Table 8 here


An  example  of  an  individual's  performance  is  in  Table  8.  Table  8 
contains  six  errors  (the  fourth  problem  in  the  second  row,  and  all  the 
problems  in  the  third  row),  not  a  very  good  score.  However,  all  but  one  of 
the  errors  were  apparently  caused  by  a  single  flaw  in  the  student's 
procedure.  When  the  student  had  to  borrow  and  encountered  a  zero,  the 
student  replaced  the  zero  by  a  nine,  but  did  not  go  further  and  decrement 
other  digits  in  the  top  number. 

Brown  and  Burton  developed  a  general  model  of  subtraction  for  which 
various  flawed  versions  can  be  represented  as  variants.  The  desired 
outcome is that the performance of each individual child, such as that
shown in Table 8, should correspond as closely as possible to one of the
variants  of  the  general  model. 

The general model has the form of a procedural network, the formalism
developed  by  Sacerdoti  (1977)  and  used  by  Greeno  et  al.  (1979)  to  explain 
constructions  and  set  in  geometry  problem  solving.  The  main  features  of  a 
procedural  network  are  that  units  of  knowledge  correspond  to  actions  at 
differing  levels  of  generality,  and  each  action  unit  includes  information 
about  conditions  for  performing  the  action  and  the  action's  consequences. 


Figure  8  here 


Figure  8  shows  the  action  components  in  Brown  and  Burton's  procedural 
network  for  subtraction.  The  diagram  shows  component  procedures  and  their 
subprocedures,  but  does  not  show  any  of  the  control  information  that  is 
also  required.  For  example,  the  diagram  includes  a  procedure 
Subtract-Column, and three subprocedures, Borrow-Needed, Do-Borrow, and
Complete-Column, which can be called from Subtract-Column. Control
knowledge  involving  these  subprocedures  includes  the  information  that 
Borrow-Needed  is  a  test  that  determines  whether  it  is  necessary  to  borrow 
before  finding  the  difference  in  the  column,  and  the  outcome  of  that  test 
determines  whether  Do-Borrow  will  be  called. 

Brown  and  Burton  formulated  models  of  faulty  performance  by  varying 
components  of  the  procedural  network  for  correct  subtraction.  For  example, 


*3.  The  idea  of  using  patterns  of  errors  to  infer  underlying 
psychological  processes  is  not  new,  either  in  the  psychological  or  the 
educational  literature.  Earlier  psychological  models  were  simpler,  and  the 
inferences about processes were correspondingly less powerful; an example
is Polson, Restle, and Polson's (1965) use of errors to identify a stage of
learning in which similar stimuli have not yet been discriminated. In the
educational literature more complex psychological distinctions have been
made, for example by Brownell (1941), but in that work, analysis of
underlying psychological processes was informal, consisting of verbal
descriptions  of  procedures  hypothesized  to  produce  observed  error  patterns, 
and  as  Brown  and  Burton  documented,  verbal  descriptions  of  procedures  turn 
out  to  be  ambiguous  in  important  ways. 


Table 8


One  Student's  Performance  on  Subtraction  Problems 
(from  Brown  &  Burton,  19/6; 

the flaw of borrowing from zero is modeled by removing some of the control
processing from the procedure Borrow-Ten in the Do-Borrow subprocedure.
The change involves removing the decision to Find-Next-Column if a zero is
found, resulting in a procedure that just changes zero to nine and adds ten
to the original column.
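
The effect of this flaw can be illustrated with a minimal sketch. The code below is not Brown and Burton's procedural network; it is an ordinary column-by-column subtraction routine with a hypothetical flag, borrow_from_zero_bug, that omits the step of moving on to the next column when a zero is found during borrowing.

```python
def subtract(top, bottom, borrow_from_zero_bug=False):
    """Column-by-column subtraction of non-negative integers (top >= bottom).
    With the flag set, a zero met while borrowing is changed to nine but no
    further column is decremented."""
    t = [int(d) for d in str(top)][::-1]      # least significant digit first
    b = [int(d) for d in str(bottom)][::-1]
    b += [0] * (len(t) - len(b))              # pad the bottom number
    answer = []
    for col in range(len(t)):
        if t[col] < b[col]:                   # test: is a borrow needed?
            nxt = col + 1
            while t[nxt] == 0:                # a zero is found while borrowing
                t[nxt] = 9
                if borrow_from_zero_bug:
                    break                     # flaw: never goes to the next column
                nxt += 1
            else:
                t[nxt] -= 1                   # decrement the column borrowed from
            t[col] += 10                      # add ten to the original column
        answer.append(t[col] - b[col])
    return int("".join(str(d) for d in reversed(answer)))
```

For 103 - 45 the correct routine gives 58, while the flawed version gives 158: the zero becomes a nine, but the hundreds digit is never decremented. Problems that require no borrowing from a zero, such as 52 - 17, are answered correctly by both versions, which is why the flaw produces a systematic pattern of errors rather than uniformly poor performance.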

The family of models that Brown and Burton arrived at included 60
procedural flaws of the kind described above. These 60 "bugs" provide
explanations for many of the patterns of performance found in the test
data, and more students' performance is explained if combinations of
elementary bugs are included in the analysis. About 40% of the students'
error patterns were explained reasonably well by single bugs or
combinations of two elementary bugs. In examining additional sets of data,
more elementary bugs have been identified (115 elementary bugs are now in
the data base), and adequate explanations are typically provided for about
40% of students who make errors (VanLehn, 1982).

An  alternative  analysis  of  subtraction  errors  was  provided  by  Young 
and  O'Shea  (1981),  who  developed  a  relatively  simple  production  system  that 
simulates  correct  subtraction  performance,  and  by  deleting  individual 
productions,  simulates  faulty  performance.  Young  and  O'Shea's  analysis 
provides explanations for about the same proportion of students as Brown
and Burton's. On the other hand, it provides explanations for only a small
proportion  of  the  patterns  of  performance  that  have  been  observed.  While 
many  patterns  occur  rarely,  their  existence  provides  evidence  for  a 
relatively  complex  generative  system. 

Another  significant  development  has  been  an  effort  by  Brown  and 
VanLehn  (1980)  and  VanLehn  (1983)  to  formulate  a  system  that  explains  the 
production  of  "buggy"  procedures.  These  formulations  distinguish  between  a 
cognitive  structure  of  partial  knowledge  of  subtraction,  and  a  "fallback" 
process  of  problem  solving  that  is  used  when  a  situation  is  encountered  for 
which the partial knowledge is not adequate. In VanLehn's (1983) version,
the  underlying  cognitive  structures,  core  procedures,  result  from  a 
combination  of  partial  learning  and  deletion  of  components  of  procedural 
knowledge.  A  core  procedure  might,  for  example,  lack  a  component  for 
dealing  with  a  zero  during  borrowing.  When  such  an  impasse  occurs  it  is 
assumed  that  the  problem  solver  applies  a  general  problem-solving  method  to 
be  able  to  continue.  Methods  assumed  to  be  available  include  skipping  an 
operation,  applying  the  operation  to  a  different  problem  element,  and  using 
an  alternative  operation  that  is  applicable  in  a  similar  problem  situation. 
One  form  of  evidence  that  supports  the  theory  comes  from  data  obtained  by 
giving  students  repeated  tests.  In  a  substantial  number  of  cases,  students 
perform differently in two tests separated by two or three days, but the
performance  can  be  explained  by  assuming  a  single  core  procedure  for  which 
different  problem-solving  methods  have  been  used. 

VanLehn (1983) conducted theoretical investigations in which a small
set  of  problem-solving  methods  is  combined  with  a  plausible  set  of  core 
procedures  to  generate  buggy  subtraction  procedures.  The  generative  system 
that  has  been  developed  can  account  for  about  one-half  of  the  buggy 
procedures  that  have  been  observed;  amendments  that  would  increase  the 
theory's  empirical  adequacy  could  be  devised  easily,  but  would  not  have 
strong  theoretical  motivation.  Part  of  the  progress  that  has  been  made 
involves  identifying  some  general  features  of  the  system.  It  can  be 
argued, based on general properties of bugs, that the system has a
push-down memory for recalling past goals, that goals are organized
hierarchically, and that the representation of a goal includes the problem
components to which the goal applies.

Another  line  of  analysis  that  has  developed  from  the  study  of 
subtraction  bugs  involves  analysis  of  cognitive  structures  for 
understanding  general  arithmetic  principles  that  underlie  correct 
subtraction  procedures.  We  will  discuss  this  theoretical  development  in 
relation to the topics of representation and understanding, in Section
II.D.2.


II.C.2. Inferences Based on Latencies. We now discuss an arithmetic
task  that  is  even  simpler  than  multidigit  calculation:  answering  basic 
addition  problems  such  as  3+5.  The  main  data  used  in  the  analyses  are 
latencies.  Patterns  of  latencies  of  individual  subjects  are  used  to 
diagnose  their  solution  processes. 

We  focus  on  an  empirical  study  by  Groen  and  Resnick  (1977).  Subjects 
in  the  experiment  were  five  preschool  children  who  knew  how  to  count  and 
could recognize the numerals 1-9, but who did not know about addition.
These  children  were  taught  a  method  for  addition  using  blocks.  The 
procedure  was  to  count  out  two  piles,  each  having  one  of  the  numbers  in  it, 
and  then  count  how  many  were  in  the  two  piles  together.  For  example,  for  3 
+  5,  the  child  could  count  out  a  pile  of  three,  then  a  pile  of  five,  and 
then  count  the  complete  set  to  find  eight  as  the  answer.  In  showing  the 
child  the  method,  the  experimenter  sometimes  started  with  the  number  on  the 
left  of  the  problem,  and  sometimes  with  the  number  on  the  right. 

The problems used were basic addition facts involving the digits 1-5,
omitting 5+5. After a child could solve all 24 of these problems
correctly  using  blocks,  a  new  apparatus  was  introduced.  The  blocks  were  no 
longer  provided,  and  the  child  answered  problems  by  pressing  buttons 
labeled  1-9.  Children  were  shown  how  to  count  out  answers  on  their 
fingers  if  this  was  necessary.  Children  received  from  four  to  seven  blocks 
of  problems  with  this  apparatus,  with  about  25  problems  per  block. 

The  latency  data  were  analyzed  using  regression  techniques;  models  of 
cognitive processes were employed to determine the values of independent
variables.  Two  models  were  used. 

According  to  one  model,  the  process  of  finding  the  answer  to  each 
problem  was  much  like  the  procedure  that  the  children  were  taught.  In  that 
procedure, a number of sets must be counted; in fact, the total number of
counts equals twice the number that is the answer. If we assume that a
fairly uniform amount of time is used each time something is counted, then
the total amount of time needed is

T = A + B(2S),

where S is the sum of the two numbers (i.e., the answer), and A and B are
constants.

According to a second model, the process is considerably simpler. The
sum can also be found by starting with the larger of the two addends and
counting up the number of the smaller addend. According to this model, the
time it takes to find the answer is

T = A + B(M),

where M is the minimum addend, and again A and B are constants. These two
models  are  called  the  Sum  Model  and  the  Min  Model,  respectively. 

Comparison  of  these  two  models  with  the  data  of  children's  performance 
is interesting primarily for the possibility that children spontaneously
change  their  procedure  for  solving  addition  problems.  If  they  use  the 
procedure  they  were  taught,  their  performance  should  agree  with  the  Sum 
Model.  However,  performance  consistent  with  the  Min  Model  would  reflect  a 
more  efficient  procedure,  and  would  indicate  that  children  had 
spontaneously  modified  their  problem-solving  procedures.  It  would  thus 
indicate  a  significant  capability  for  discovery  or  invention. 

To apply either the Sum or the Min Model to the data, problems are
grouped according to the number of counting operations they require.
Because the models specify different counting operations, they imply
different  groupings  of  items.  For  example,  according  to  the  Sum  Model,  the 
problems  6+1,  5+2,  and  4+3  all  require  the  same  number  of  operations, 
but  these  problems  require  different  numbers  of  counts  according  to  the  Min 
Model.  On  the  other  hand,  the  problems  4+3  and  3+5  require  the  same 
number  of  counts  by  the  Min  Model,  but  are  different  according  to  the  Sum 
Model.
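
The groupings implied by the two models can be generated directly from their definitions. The sketch below uses hypothetical function names and takes the problem set to be the 24 ordered pairs of addends from 1 to 5, omitting 5 + 5.

```python
from collections import defaultdict

# The 24 problems: ordered pairs of addends from 1 to 5, omitting 5 + 5.
PROBLEMS = [(m, n) for m in range(1, 6) for n in range(1, 6) if (m, n) != (5, 5)]


def sum_counts(m, n):
    """Counting operations under the Sum Model: twice the answer."""
    return 2 * (m + n)


def min_counts(m, n):
    """Counting operations under the Min Model: the smaller addend."""
    return min(m, n)


def group_by_counts(problems, counts):
    """Group problems by the number of counting operations a model assigns."""
    groups = defaultdict(list)
    for m, n in problems:
        groups[counts(m, n)].append((m, n))
    return dict(groups)
```

Under the Sum Model, 5 + 2 and 4 + 3 fall in the same group (14 counts each), whereas under the Min Model they require 2 and 3 counts. Conversely, 4 + 3 and 3 + 5 share a group under the Min Model (3 counts) but differ under the Sum Model (14 and 16 counts).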

If  a  model  is  approximately  correct,  the  regression  based  on  that 
model  should  give  accurate  predictions  of  problem  latency.  The  criterion 
of fit used by Groen and Resnick was the proportion of variance, R^2,
accounted for by the regression. Higher values of R^2 indicate better
agreement  between  the  latency  data  and  the  theoretical  function. 
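
The comparison of fit can be sketched with an ordinary least-squares regression of latency on each model's predictor. The latencies below are invented for illustration (they are generated from the Min Model with assumed constants A = 1.0 and B = 0.5) and are not data from the study; the function name is likewise hypothetical.

```python
def r_squared(xs, ys):
    """Proportion of variance in ys accounted for by a least-squares line on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Addends 1 to 5, omitting 5 + 5, as in the experiment described above.
problems = [(m, n) for m in range(1, 6) for n in range(1, 6) if (m, n) != (5, 5)]

# Invented latencies from a hypothetical child who follows the Min Model.
latencies = [1.0 + 0.5 * min(m, n) for m, n in problems]

r2_min = r_squared([min(m, n) for m, n in problems], latencies)
r2_sum = r_squared([2 * (m + n) for m, n in problems], latencies)
```

For these artificial latencies the Min predictor accounts for essentially all of the variance, while the Sum predictor accounts for less, because problems with the same sum (for example 1 + 5 and 3 + 3) differ in their minimum addend. Real latencies are noisy, so in practice neither value would approach 1.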


Table  9  here 


Table 9 shows that about one-half of the subjects were fit better by
the Min Model than by the Sum Model. Values of R^2 are shown for latency
data from each block of problems except the first, in which the children
were getting used to the new apparatus. Subjects 2 and 4 were fit better
by the Min Model, Subject 5 was fit better by the Sum Model, and Subject 1
underwent a transition, being fit better by the Sum Model in Blocks 2-5,
but by the Min Model in Blocks 6 and 7. Another experiment, in which
practice problems were presented in a systematic order, had similar
results.

The  important  conclusion  taken  from  these  data  is  that  children  must 
have  discovered  the  procedure  represented  by  the  Min  Model,  since  they  were 
not  taught  how  to  add  in  that  way.  Neches  (1981)  has  developed  an  analysis 
of  learning  mechanisms  that  can  produce  modified  procedures,  and  used  his 
system  to  simulate  changes  in  counting  procedures  for  addition  problems. 

The main ideas in Neches's model are that redundant components of the
procedure can be removed, and when there are alternative ways of reaching
the same result, the easier method can be chosen. For example, in the Sum
procedure, the first addend is counted, then later the process of counting
the combined set includes counting the first addend as a part. Noticing
this redundancy leads to removal of the initial count of the first addend
from the procedure. Choice of the larger addend to initialize the
procedure can be made if it is noticed that the same result is obtained
regardless of which addend is used, and less effort is required if the
larger addend is chosen. To produce modifications in its procedures,
Neches's system requires a trace of its activity, including the goals that
are active during the various stages of its performance.


Table 9. Results of Applying Regression Models to Laten
(Groen & Resnick, 1977).

The regression method also has been used in analyzing performance of
adults in simple arithmetic tasks. Groen and Parkman (1972) found that
college students' performance is quite consistent with the Min Model. The
slope  of  the  best  fitting  regression  equation  is  far  too  small  to 
correspond  to  verbal  counting,  but  an  analogue  of  a  counting  procedure 
might  be  postulated  to  account  for  the  result. 

More recent studies of performance in mental arithmetic have been
conducted by Ashcraft and his associates. Using a task in which subjects
are shown a problem with a possible answer and are asked whether it is
correct, Ashcraft and Battaglia (1978) found an effect of problem size
(i.e., longer latency when problems involve larger numbers), but this
effect was not linear in the smaller addend, as required by the Min Model.
A better predictor of latency was the square of the problem sum, an effect
that seems inconsistent with a simple process of counting. Ashcraft and
Battaglia also found that latencies were shorter for rejecting wrong answers
that are very different from the correct answer, compared to wrong answers
that are near the correct answer. Another relevant finding by Winkelman
and Schmidt (1974) is that latency was slowed by a false answer that would
be correct for a different operation; for example, 3 x 4 = 7. As Ashcraft
and Stazyk (1981) have argued, these findings can be explained most easily
by assuming a process of retrieval from memory, rather than a counting
procedure, with effects on latency that result from the way in which
information is stored and from processes of activation and search.



II.D. Problem Understanding; Representation

Before a problem can be solved, it must be understood. Many problems
used  in  education  are  presented  as  natural-language  texts  that  describe 
situations  and  ask  questions,  usually  the  values  of  some  quantities.  In 
laboratory  studies,  problems  often  are  presented  in  the  form  of 
instructions  that  specify  the  goals  and  problem-solving  operators  that  can 
be  used  in  working  on  the  problems.  These  texts  or  instructions  must  be 
interpreted,  and  some  kind  of  representation  of  the  problem  must  be 
generated  before  problem-solving  processes  can  be  put  to  work  in  seeking  a 
solution.

The  same  problem  may  be  represented  in  radically  different  ways.  This 
is  illustrated  dramatically  by  the  "mutilated  checkerboard"  problem.  We 
are given an ordinary 8 x 8 checkerboard, with alternating black and red
squares,  and  a  set  of  dominoes,  each  of  which  is  exactly  the  right  size  to 
cover  two  squares.  The  entire  board  can  be  covered  neatly  by  32  dominoes, 
with  no  square  uncovered,  and  no  domino  hanging  over  the  edge  of  the  board. 
Suppose  now  that  the  north-east  square  and  the  south-west  square  of  the 
checkerboard  are  cut  off,  leaving  62  squares.  Can  the  mutilated  board  now 
be  covered  neatly  by  31  dominoes? 

It  is  impossible  for  a  human  being  or  a  computer  to  answer  this 
question by exhaustive search in the obvious but enormous problem space in
which  the  squares  and  dominoes  are  represented  directly.  Consider, 
however,  an  abstract  problem  space  in  which  we  represent  only  the  number  of 
dominoes  that  have  been  laid  down,  and  the  numbers  of  black  squares  and  of 
red  squares  that  remain  uncovered.  At  the  outset,  because  of  the 
mutilation,  there  are  32  red  squares,  but  only  30  black  squares  (or  vice 
versa).  Each  domino  covers  exactly  one  red  and  one  black  square.  Hence, 
no  matter  how  the  dominoes  are  placed  on  the  board,  after  30  have  been 
placed,  if  that  is  possible,  two  red  squares  and  no  black  squares  will 
remain  uncovered.  But  the  final  domino  cannot  cover  two  red  squares,  hence 
there  is  no  way  to  complete  the  covering.  Here,  a  change  in  problem 
representation  changes  the  problem  from  one  that  is  practically  unsolvable 
to  one  that  is  solvable  relatively  easily. 
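
The abstract representation described above can be sketched in a few lines. The sketch assumes one particular coloring convention (a square is black when row + column is even); which color loses two squares depends on that convention, as the "or vice versa" in the text indicates. The function names are hypothetical.

```python
def color_counts(removed):
    """Count the squares of each color left on an 8 x 8 board after removals."""
    counts = {"black": 0, "red": 0}
    for row in range(8):
        for col in range(8):
            if (row, col) in removed:
                continue
            color = "black" if (row + col) % 2 == 0 else "red"
            counts[color] += 1
    return counts


def parity_rules_out_tiling(removed):
    """True when unequal color counts make a domino covering impossible."""
    counts = color_counts(removed)
    return counts["black"] != counts["red"]


opposite_corners = {(0, 7), (7, 0)}
print(color_counts(opposite_corners))
print(parity_rules_out_tiling(opposite_corners))
```

The test is one-directional: unequal counts rule out a covering, but equal counts do not by themselves guarantee that a covering exists.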


Figure  9  here 


Another famous example of problem understanding, discussed by
Wertheimer (1945/1959), arises in finding the area of a parallelogram.
Students are taught that the area of a parallelogram can be calculated with
a formula A = b x h, where b and h are the base and height, respectively.
Wertheimer described two ways in which the formula may be understood. In
one representation, b is the length of a horizontal side of the
parallelogram, and h is the length of a vertical line drawn from a corner
at the top of the figure to its base, as shown on the left of Figure 9.
Many students, apparently using that representation, become confused if
they are then asked to find the area of a parallelogram oriented
differently, as on the right side of Figure 9. Another way to understand
the formula includes a relationship between parallelograms and rectangles.
A parallelogram can be transformed into a rectangle by removing a
triangular piece from one end and attaching it to the other end. Then b
and h are equal to the length and width, respectively, of the rectangle
that the parallelogram can be transformed into. Children who understand
the parallelogram problem in this way have no difficulty in solving
problems where the figure is oriented differently, and frequently can
transfer their knowledge to solve more complex problems, such as finding
the area of a trapezoid. The two representations involve different
features of specific problems, one with b and h identified with specific
locations in the figure, and the other with b and h defined in more general
terms.
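
The more general reading of A = b x h can be checked numerically. In the sketch below, h is defined as the perpendicular distance from the base to the opposite side rather than as a vertical line, so the same computation applies to a rotated figure. The vertex coordinates and function names are illustrative assumptions, and the polygon-area check uses the standard shoelace formula.

```python
import math


def shoelace_area(vertices):
    """Area of a simple polygon from its vertices, independent of orientation."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def base_times_height(vertices):
    """A = b x h, with h measured perpendicular to the chosen base side."""
    (x0, y0), (x1, y1), (x2, y2) = vertices[0], vertices[1], vertices[2]
    bx, by = x1 - x0, y1 - y0
    b = math.hypot(bx, by)
    # Perpendicular distance from the third vertex to the line through the base.
    h = abs(bx * (y2 - y0) - by * (x2 - x0)) / b
    return b * h


def rotate(vertices, degrees):
    """Rotate every vertex about the origin."""
    t = math.radians(degrees)
    return [(x * math.cos(t) - y * math.sin(t),
             x * math.sin(t) + y * math.cos(t)) for x, y in vertices]


upright = [(0, 0), (4, 0), (5, 3), (1, 3)]
tilted = rotate(upright, 60)
print(base_times_height(upright), base_times_height(tilted))
```

A representation that ties h to a vertical line would give the right answer only for the upright figure; the perpendicular definition agrees with the polygon area in both orientations.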

In Section II.D.1, we discuss studies of problem understanding,
involving instructions for novel problems and text problems in domains
where the problem solver already has learned the problem-solving operators.
In Section II.D.2, we discuss understanding of the structures of problems
and problem solutions, in contrast to mere rote or mechanical knowledge of
problem-solving procedures, the issue emphasized by Wertheimer and other
Gestalt psychologists.

II.D.1. Understanding Problem Instructions. In most studies,
consideration of subjects' behaviors in problem-solving tasks is begun
after the subjects have received the problem instructions, including the
definition  of  the  problem,  and  have  been  tested  by  the  experimenter  for 
their  understanding  of  the  problem.  In  a  few  cases  that  we  discuss  here, 
the  processes  studied  are  those  required  for  assimilating  the  problem  prior 
to  making  attempts  to  solve  it. 

In the situations that have been studied, solution of the problem is
likely to proceed by a form of means-ends analysis. Therefore, the
information that subjects extract from instructions is probably similar to
the information needed by the General Problem Solver. When GPS is given a
problem, it is provided with a list of the objects with which the problem
is concerned, the relevant properties of these objects, operators for legal
moves, a description of the starting situation, and a set of tests to
determine when the final goal has been reached. It must either be provided
with, or acquire by learning, a set of tests for differences between
situations, and a set of productions that evoke, when particular
differences are present, operators that are relevant for reducing these
differences.

For example, in the Tower of Hanoi problem, the objects are disks (N
in number) and pegs (3). A legal move consists in transferring the smallest
disk on some peg to another peg that holds no smaller disk. Hence, the
size of a disk is its relevant property. Situations differ with respect to
which disks are on a particular peg, or with respect to the peg on which a
particular disk is located. In the starting situation, all the disks are
held, say, on a single peg; the goal is to move the entire set of disks to
some particular other peg. The problem description must provide all of
this information, in English, and the subject (or computer program) must
convert this English prose into an internal representation that permits
situations and moves and their consequences to be modeled. A disk, for
instance, may be represented as a schema, one of whose attributes is its
size; a peg by a schema, one of whose attributes is the list of disks
currently on that peg. A move operator is a process that changes a pair of
the latter lists, by moving the name of a particular disk from the one list
to the other.
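
One way such an internal representation might look is sketched below. The class and function names are hypothetical; the sketch follows the description above (a disk schema with a size attribute, a peg schema with a list of disks, a move operator that edits a pair of lists) and is not the representation used by GPS or any other particular program.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Disk:
    name: str
    size: int  # the only property relevant to move legality


@dataclass
class Peg:
    name: str
    disks: list = field(default_factory=list)  # disks currently on this peg


def is_legal(disk, source, target):
    """The disk must be the smallest on its peg; the target holds no smaller disk."""
    if disk not in source.disks:
        return False
    smallest_on_source = min(source.disks, key=lambda d: d.size)
    return disk == smallest_on_source and all(d.size > disk.size for d in target.disks)


def move(disk, source, target):
    """Move operator: shift the disk's entry from one peg's list to the other's."""
    if not is_legal(disk, source, target):
        raise ValueError(f"illegal move of {disk.name}")
    source.disks.remove(disk)
    target.disks.append(disk)


def goal_reached(goal_peg, all_disks):
    """Goal test: every disk is on the designated peg."""
    return set(goal_peg.disks) == set(all_disks)


small, large = Disk("small", 1), Disk("large", 2)
peg_a, peg_b, peg_c = Peg("A", [large, small]), Peg("B"), Peg("C")
move(small, peg_a, peg_b)
move(large, peg_a, peg_c)
move(small, peg_b, peg_c)
print(goal_reached(peg_c, [small, large]))
```
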

Two central problems for psychological research on the understanding
of problem instructions are: (1) how the verbal instructions are converted
to an internal representation; and (2) what characteristics of the
instructions cause the problem to be represented in one way, rather than
other possible ways. The second question is especially important when
alternative representations differ in the difficulty of solving the
problem, as with the mutilated checkerboard example, or provide differing
degrees of generality, as with the parallelogram problem. The questions
have been addressed by Hayes and Simon (e.g., 1974), who obtained
information about internal representations by collecting extensive verbal
protocols of problem-understanding processes. By using problems where
alternative representations were available, Hayes and Simon have also cast
light on the question of which representation will be formed.

The Understand program (Hayes & Simon, 1974) is a computer simulation
of the problem understanding process for puzzle-like problems like the
Tower of Hanoi or the Missionaries and Cannibals problem -- that is, for
problems that do not assume the subject has any prior knowledge of the
problem domain. The program matches human thinking-aloud protocols
sufficiently well to lay claim to being a good first-approximation model
of the process.

Understand operates in two principal phases. In the first phase, a
language-parsing program extracts the deep structure from the language of
the instructions. In the second phase, another set of processes constructs
from this information a problem representation that is suitable as input to
a GPS-like problem-solving program. This is accomplished by (a)
identifying the objects and sets of objects that are mentioned in the
parsed text, (b) identifying the descriptors of those objects and the
relations among them, (c) identifying the descriptions of legal moves and
constructing move operators that fit those descriptions, (d) identifying
the description of the solution and constructing a test for attainment of
the solution, and (e) constructing an organization of schemata that
describes the initial problem situation.

For  example,  after  parsing  the  written  description  of  the  Tower  of 
Hanoi  problem,  Understand  would  identify  pegs  and  disks  as  the  relevant 
sets of objects, and would notice that disks are on pegs and that disks
move  from  one  peg  to  another.  It  would  extract  the  information  that  only 
the  smallest  disk  on  a  peg  may  be  moved,  and  only  to  a  peg  where  there  is 
no  smaller  disk,  and  would  construct  a  test  process  for  checking  these 
conditions.  It  would  determine  that  the  problem  is  solved  when  all  the 
disks  are  on  (say)  the  third  peg,  and  would  construct  a  test  to  determine 
when  that  condition  is  satisfied.  Finally,  it  would  generate  a  list 
structure  showing  all  the  disks  initially  as  being  on  the  first  peg.  From 
the  evidence  of  protocols,  and  of  subjects'  subsequent  problem-solving 
behavior,  this  is  what  human  solvers  do  also. 

Problem Isomorphs. A powerful experimental manipulation for studying
problem understanding is to use variant problem instructions all of which
describe isomorphs of a single problem. Two problems are isomorphs if the
legal problem situations and legal moves of the one can be mapped in
one-to-one fashion on the situations and moves of the other. Then, if
situation S' is the isomorph of S and moves A', B', etc., are the isomorphs
of A, B, etc., and if the succession of moves A, B,... takes the one
system from S to T, then the succession of moves A', B',... will take the
other system from S' to T', where T' is the isomorph of T.
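
The definition can be made concrete with a small sketch. Here a two-disk Tower of Hanoi is paired with a hypothetical "change" variant in which each item takes on one of three sizes instead of occupying one of three pegs. The relabeling table and the wording of the variant are assumptions made for illustration; they are not the problem texts used in the experiments.

```python
PEG_TO_SIZE = {0: "small", 1: "medium", 2: "large"}  # hypothetical relabeling


def legal(state, item, target):
    """No lower-ranked item may share the item's current or target value."""
    current = state[item]
    return target != current and all(
        state[i] not in (current, target) for i in range(item)
    )


def apply_move(state, item, target):
    """Return the situation that results from one legal move."""
    if not legal(state, item, target):
        raise ValueError("illegal move")
    return state[:item] + (target,) + state[item + 1:]


def run(state, moves):
    """Apply a succession of moves A, B, ... starting from the given situation."""
    for item, target in moves:
        state = apply_move(state, item, target)
    return state


def to_change_state(state):
    """Map a Hanoi situation S to its isomorph S'."""
    return tuple(PEG_TO_SIZE[peg] for peg in state)


def to_change_move(hanoi_move):
    """Map a Hanoi move A to its isomorph A'."""
    item, peg = hanoi_move
    return (item, PEG_TO_SIZE[peg])


start = (0, 0)                      # both disks on peg 0
moves = [(0, 1), (1, 2), (0, 2)]    # (disk, target peg)
final = run(start, moves)
final_prime = run(to_change_state(start), [to_change_move(m) for m in moves])
print(to_change_state(final) == final_prime)
```

The final comparison checks the property stated in the definition: carrying out the mapped moves from S' arrives at the isomorph of the situation reached from S.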

Using  a  number  of  isomorphs  of  the  Tower  of  Hanoi  problem,  Hayes  and 
Simon  (1977)  demonstrated  that  problem  difficulty  varied  by  a  factor  of  two 
to  one  from  one  class  of  problem  descriptions  (transfer  problems)  to 
another  class  (change  problems).  Moreover,  protocols  and  diagrams  produced 
by  subjects  showed  that  they  were  consistently  using  different 
representations  for  the  different  classes  of  isomorphic  problems.  The 
Understand  program  behaved  in  the  same  way,  constructing  different 
representations  for  the  transfer  and  change  problems,  respectively.  In 
only  one  case  out  of  the  nearly  100  that  have  been  examined  did  a  subject 
shift  from  the  more  difficult  "change"  representation  to  the  easier 
"transfer"  representation. 

The  reasons  why  the  change  problems  take  twice  as  long  to  solve  as  the 
isomorphic  transfer  problems  have  not  been  fully  elucidated.  It  can  be 
shown,  however,  that  the  tests  for  move  legality  are  a  little  more  complex 
for  the  former  than  for  the  latter,  and  this  additional  complexity  may 
increase  the  short-term  memory  load  on  the  subject  who  is  seeking  to 
understand the problem instructions.

Problem isomorphs can be used to study transfer of training, and such
a  study  was  conducted  by  Reed,  Ernst,  and  Banerji  (1974).  They  devised  a 
variant  of  the  missionaries  and  cannibals  problem,  called  the  jealous 
husbands  problem.  The  latter  differs  from  the  former  in  that  specific 
husbands  are  paired  with  specific  wives,  and  no  woman  may  be  left  in  the 
company  of  men  unless  her  husband  is  present.  Experimental  results  showed 
that  subjects  were  not  better  at  solving  one  of  these  problems  if  they  had 
previously solved the other. We must conclude that, while subjects may use
analogies  to  help  solve  problems,  there  is  nothing  automatic  about  the 
availability  of  an  analogy,  and  subjects  may  fail  to  take  advantage  of 
analogies  unless  their  attention  is  drawn  to  them  or  they  are  made  salient 
in  some  other  way.  (Experimental  results  showing  positive  transfer  between 
problem  isomorphs  for  a  somewhat  different  type  of  problem  are  discussed  in 
Section III.A.3.)

II.D.2. Problem Representation in Mathematics and Physics.

Typically,  a  problem  given  in  a  mathematics  or  physics  text  describes  a 
situation,  including  quantitative  values  of  some  variables,  and  asks  for 
the  value  of  another  variable.  The  given  quantities  correspond  to  the 
initial  state  of  a  problem  and  the  unknown  quantity  provides  the  goal.  The 
problem is presented in a natural-language text, as are the instructions for
novel  problems  that  we  discussed  in  the  previous  section.  The  situation 
with  a  physics  or  mathematics  problem  differs  from  a  puzzle  in  that  the 
instructions  for  the  former  do  not  provide  a  description  of  the 
problem-solving  operators  that  can  be  used.  The  student  is  assumed  to 
already  know  the  operators,  based  on  class  instruction  or  reading  the  text. 
The  interpretation  of  puzzle  instructions  is  a  representation  that  can  be 
used  by  a  general  problem-solving  system  such  as  GPS,  while  the 
interpretation of a text problem in mathematics or physics is a
representation that can be used by domain-specific problem-solving
procedures.

Algebra Word Problems. Word problems in algebra describe situations
that can be translated into equations, which are then solved to find the
values of unknown variables. An early model of solution of word problems
called Student (Bobrow, 1968) showed that the translation can be
accomplished mainly using the forms of sentences in the problem text, and
of course the numerical quantities, with very little knowledge about the
objects that are described. For example, in the sentence "The number of
customers Tom gets is twice the square of the number of advertisements he
runs," Student does not need to know anything about what customers or
advertisements are, but can form the equation X = 2Y² using the function
words "is" and "of" in critical ways.
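
A purely syntactic translation of this kind can be sketched with a single sentence template. The pattern, the small table of multipliers, and the output format below are assumptions made for illustration; Bobrow's program was considerably more general, and this sketch does not reproduce its method.

```python
import re

# One hypothetical sentence form: "The number of <X> is <factor> the square
# of the number of <Y>". The noun phrases are carried along as opaque strings.
PATTERN = re.compile(
    r"^The number of (?P<lhs>.+?) is (?P<factor>twice|three times) "
    r"the square of the number of (?P<rhs>.+?)\.?$",
    re.IGNORECASE,
)
FACTORS = {"twice": 2, "three times": 3}


def translate(sentence):
    """Return the variables and the equation, or None if the form is unfamiliar."""
    match = PATTERN.match(sentence.strip())
    if match is None:
        return None
    factor = FACTORS[match["factor"].lower()]
    return {
        "X": match["lhs"],
        "Y": match["rhs"],
        "equation": f"X = {factor} * Y**2",
    }


sentence = ("The number of customers Tom gets is twice the square of "
            "the number of advertisements he runs.")
print(translate(sentence))
```

Nothing in the sketch depends on what customers or advertisements are; only the function words and the sentence form drive the translation.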

In  an  empirical  study  of  the  solving  of  algebra  word  problems,  Paige 
and  Simon  (1966)  found  a  good  deal  of  similarity  between  human  solutions 
and  those  given  by  Bobrow's  Student  program.  However,  they  found  that 
their  more  skillful  subjects  used  an  intermediate  semantic  representation 
in  the  translation  of  the  English-language  problem  statements  into 
algebraic  equations.  Some  problems  presented  descriptions  of  situations 
that  were  contradicted  implicitly  by  real-world  knowledge  (boards  of 
negative  length,  nickels  worth  more  than  quarters,  and  so  on).  The  weaker 
subjects  often  made  accurate  syntactic  translations  of  English  into 
equations,  as  Student  does,  even  though  the  equations  represented  nonsense 
situations.  The  abler  subjects  either  noticed  the  contradictions  between 
the  statements  and  their  knowledge,  or  translated  the  statements 
(carelessly)  into  equations  that  were  not  quite  equivalent  syntactically, 
but  which  represented  physically  realizable  situations. 

Another  difference  between  the  abler  and  weaker  subjects  was  that  the 
former,  but  not  the  latter,  generally  drew  diagrams  of  the  problem 
situation  that  contained  all  the  essential  relations  from  which  the 
equations  could  be  derived. 

Both  kinds  of  evidence  —  the  response  to  "impossible"  situations,  and 
the  nature  of  the  problem  diagrams  produced  —  point  strongly  to  the 
employment  by  the  more  competent  subjects  of  an  intermediate  semantic 
representation  of  the  problem  situations,  rather  than  a  direct  translation 
from  English  to  algebra. 

Arithmetic  Word  Problems.  Detailed  analyses  of  intermediate 
representations  have  been  worked  out  for  a  class  of  word  problems  in 
elementary  arithmetic.  Riley,  Greeno,  and  Heller  (1983)  and  Briars  and 
Larkin  (in  press)  have  developed  models  of  representation  and  solution  of 
word  problems  that  are  solved  by  a  single  operation  of  addition  or 
subtraction.  Examples  of  the  problems  studied  are  "Jay  had  eight  books; 
he  lost  five  of  them;  how  many  books  does  Jay  have  now?"  or  "Jay  has  some 
books;  Kay  has  seven  more  books  than  Jay;  Kay  has  eleven  books;  how  many 
books  does  Jay  have?" 

In  Riley  et  al.'s  (1983)  model,  problems  are  represented  by  three 
schemata  that  provide  knowledge  of  basic  quantitative  relationships.  One 
schema  represents  problems  involving  events  that  change  the  value  of  a 
quantity,  either  by  increasing  it  or  decreasing  it,  as  with  losing  five 
books.  A  second  schema  represents  problems  in  which  two  separate 
quantities  are  considered  in  combination.  A  third  schema  represents 
problems  involving  comparison  between  two  separate  quantities.  (This 
classification of problems is not unique; similar but distinct
characterizations  have  been  worked  out  by  Carpenter  and  Moser,  1982;  by 
Nesher,  1982;  and  by  Vergnaud,  1982.) 

Arithmetic  word  problems  are  usually  classified  according  to  the 
operations  used  in  their  solution,  and  children  are  often  taught  to  look 
for  certain  key  words  to  decide  how  to  solve  the  problems.  This  is 
inadequate,  because  choice  of  the  correct  operation  depends  on 
understanding  the  structure  of  quantities  in  the  problem,  rather  than  on  a 
single  feature  corresponding  to  a  key  word.  For  example,  "altogether"  is 
sometimes  suggested  as  a  key  word  for  addition,  but  this  is  not  a  reliable 
cue,  as  in  the  problem  "Jay  and  Kay  have  nine  books  altogether;  Jay  has 
seven  books;  how  many  books  does  Kay  have?" 
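
The contrast between a key-word rule and schema-based solution can be sketched as follows. The schema slots are filled by hand here, so the sketch omits the text-comprehension step entirely; the function names and the particular key-word rule are hypothetical and are not taken from Riley et al.'s model.

```python
def keyword_strategy(text, numbers):
    """Choose an operation from a surface cue alone."""
    a, b = numbers
    return a + b if "altogether" in text else abs(a - b)


def solve_change(start, delta, direction):
    """Change schema: an event increases or decreases a quantity."""
    return start + delta if direction == "increase" else start - delta


def solve_combine(whole=None, part_a=None, part_b=None):
    """Combination schema: whole = part_a + part_b; exactly one slot is unknown."""
    if whole is None:
        return part_a + part_b
    known_part = part_a if part_a is not None else part_b
    return whole - known_part


def solve_compare(larger=None, smaller=None, difference=None):
    """Comparison schema: larger = smaller + difference; one slot is unknown."""
    if difference is None:
        return larger - smaller
    if larger is None:
        return smaller + difference
    return larger - difference


text = ("Jay and Kay have nine books altogether; Jay has seven books; "
        "how many books does Kay have?")
print(keyword_strategy(text, (9, 7)))          # adds, because of "altogether"
print(solve_combine(whole=9, part_a=7))        # treats nine as the whole
print(solve_change(8, 5, "decrease"))          # Jay had eight books, lost five
print(solve_compare(larger=11, difference=7))  # Kay has seven more than Jay
```

The key-word rule adds the two numbers in the "altogether" problem, whereas the combination schema identifies nine as the whole and subtracts the known part.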

Riley  et  al.'s  model  simulates  children's  solutions  of  word  problems 
when  small  blocks  are  available  for  the  children  to  use  in  solving  the 
problems.  The  model  forms  representations  of  problem  texts  using  the 
schemata  of  change,  combination,  and  comparison.  Based  on  the 
representation  that  is  formed  for  a  problem,  the  model  performs 
quantitative  actions,  such  as  joining  two  sets  of  objects  together  or 
taking  a  specified  number  of  objects  away  from  a  set  and  counting  how  many 
remain.  Different  versions  of  the  model  were  formed  to  correspond  to 
different  levels  of  skill  that  were  observed  in  a  study  of  children  from 
kindergarten  through  the  third  grade.  The  versions  differ  in  the  detail 
with  which  internal  representations  are  formed  (affecting  ability  to 
retrieve information from earlier steps), and in their ability to perform
transformations  that  provide  information  in  a  form  needed  to  make 
inferences.  The  patterns  of  correct  responses  and  errors  observed  in  the 
performance  of  most  of  the  children  were  consistent  with  patterns  that  were 
obtained  in  the  simulation  models. 

Briars  and  Larkin's  (in  press)  model  constructs  less  elaborate 
intermediate  representations  of  problems,  and  thus  relies  somewhat  more 
strongly  on  procedures  for  inference.  Briars  and  Larkin's  model  does  use  a 
schema  for  representing  part-whole  relations  among  sets  for  some  relatively 
difficult  problems. 

Physics  Problems.  The  knowledge  structures  used  in  simulating 
solutions  of  arithmetic  word  problems  are  quite  general,  involving 
relationships  between  quantities  that  children  probably  learn  about  in 
their  ordinary  experience.  In  technical  domains  such  as  physics,  specific 
instruction  is  given  to  teach  students  the  nature  of  theoretical  quantities 
and  the  ways  that  they  combine. 

Novak  (1977)  constructed  a  program  called  Isaac  that  builds  problem 
representations  from  English  problem  descriptions  in  a  domain  of  physics, 
simple  statics  problems.  Isaac  uses  schemata  of  physical  subsystems 
(levers,  masses,  etc.),  assumed  already  understood  by  the  solver,  to  build 
up  a  compound  schema  to  fit  the  problem  at  hand.  Thus,  it  may  assemble  a 
wall schema (surface), a floor schema (surface), a ladder schema (lever),
and a man schema (mass) to represent a situation of a man standing on a
ladder that is leaning against a wall, assigning to each component
appropriate numerical quantities and appropriate connections with the
others.
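
The idea of a compound schema assembled from component schemata can be sketched as below. The class names, attributes, and numerical values are hypothetical. The equilibrium computation is ordinary textbook statics for a ladder against a frictionless wall, added here only to show that the assembled representation supports a solution; it is not a description of how Isaac computes.

```python
import math
from dataclasses import dataclass


@dataclass
class Surface:
    name: str
    frictionless: bool


@dataclass
class Lever:
    name: str
    length: float     # meters
    weight: float     # newtons, acting at the midpoint
    angle_deg: float  # angle between the lever and the floor


@dataclass
class Mass:
    name: str
    weight: float    # newtons
    position: float  # meters along the lever, measured from its base


@dataclass
class LadderProblem:
    """Compound schema: component schemata plus their connections."""
    wall: Surface
    floor: Surface
    ladder: Lever
    man: Mass


def wall_force(problem):
    """Horizontal force from a frictionless wall, by torque balance about the base."""
    if not problem.wall.frictionless:
        raise ValueError("this sketch assumes a frictionless wall")
    t = math.radians(problem.ladder.angle_deg)
    torque_from_weights = (
        problem.ladder.weight * (problem.ladder.length / 2) * math.cos(t)
        + problem.man.weight * problem.man.position * math.cos(t)
    )
    return torque_from_weights / (problem.ladder.length * math.sin(t))


problem = LadderProblem(
    wall=Surface("wall", frictionless=True),
    floor=Surface("floor", frictionless=False),
    ladder=Lever("ladder", length=5.0, weight=200.0, angle_deg=45.0),
    man=Mass("man", weight=800.0, position=2.5),
)
print(round(wall_force(problem), 1))
```
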

Models such as Riley's for arithmetic word problems and Novak's for
physics problems are based on the idea that understanding a problem
requires schematic knowledge regarding the quantities in problem
situations. The schemata provide knowledge of ways in which quantities are
related to one another. These quantitative relations are not expressed
adequately in algebraic formulas that are taught in physics and other
quantitative sciences, although of course the formulas are based on the
quantitative relations and students must be able to choose formulas and
assign values of their variables correctly on the basis of the problem
representations that they construct.

The  distinction  between  knowledge  of  a  formula  and  knowledge  of 
quantities  and  their  relations  is  illustrated  in  experiments  conducted  by 
Mayer (1974). The experiments were instructional studies, concerned with
different  methods  of  teaching  the  formula  for  binomial  probability.  One 
group  of  subjects  received  instruction  that  emphasized  calculation, 
presenting  components  of  the  formula  with  explanations  of  the  calculation 
steps,  practice  exercises,  and  relatively  brief  explanations  of  the 
referents  of  terms  in  the  formula.  Another  condition  emphasized  the 
information  needed  for  students  to  acquire  schematic  knowledge,  presenting 
definitions  of  terms  and  explanations  of  relevant  concepts  such  as  the 
number  of  combinations  and  the  probability  of  a  single  sequence  of  outcomes 
before  calculation  exercises  were  presented.  Tests  that  were  given 
following  instruction  included  a  variety  of  problems,  including  some  that 
involved  direct  application  of  the  formula,  and  others  that  required  more 
interpretation.  The  latter  group  included  word  problems,  problems  that 
could  not  be  solved  because  of  inconsistent  or  insufficient  information, 
and  problems  requiring  use  of  a  component  of  the  formula  rather  than  the 
complete  formula.  The  subjects  whose  instruction  emphasized  the  formula 
excelled  on  the  problems  involving  direct  use  of  the  formula,  but  the 
subjects  given  more  conceptual  instruction  were  more  successful  on  the 
problems  requiring  more  interpretation. 
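
The two components mentioned above, the number of combinations and the probability of a single sequence of outcomes, can be written separately and then combined into the standard binomial formula. The function names are hypothetical; the sketch illustrates the structure of the formula and does not reproduce Mayer's instructional materials.

```python
from math import comb


def number_of_combinations(n, k):
    """How many distinct sequences of n trials contain exactly k successes."""
    return comb(n, k)


def single_sequence_probability(n, k, p):
    """Probability of one particular sequence with k successes in n trials."""
    return p ** k * (1 - p) ** (n - k)


def binomial_probability(n, k, p):
    """Complete formula: count of sequences times probability of each."""
    return number_of_combinations(n, k) * single_sequence_probability(n, k, p)


# Exactly two heads in three tosses of a fair coin.
print(number_of_combinations(3, 2))
print(single_sequence_probability(3, 2, 0.5))
print(binomial_probability(3, 2, 0.5))
```

A problem that asks only how many orderings are possible calls for the first component alone, which is the kind of item on which the conceptually instructed subjects did better.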

Several  studies  have  compared  performance  of  physics  students  with 
performance  of  expert  physicists  to  identify  some  of  the  components  of 
knowledge  that  characterize  more  advanced  problem  solvers.  Three  of  the 
differentiating  characteristics  that  have  been  identified  are  (1)  use  by 
experts  of  abstract  physics  principles  in  representing  problems  as  well  as 
in providing methods of solution; (2) strong organization of physics
knowledge  including  knowledge  of  relationships  among  principles  and 
recognition  of  complex  patterns  of  problem  features;  and  (3)  integration 
of  physics  knowledge  with  general  concepts  and  reasoning  processes. 

Experts'  use  of  abstract  physics  concepts  was  shown  in  experiments  by 
Chi,  Feltovich  and  Glaser  (1981),  who  gave  subjects  a  set  of  24  physics 
textbook  problems  and  asked  the  subjects  to  sort  the  problems  into  groups. 
Groupings  formed  by  advanced  graduate  students  were  based  primarily  on 
abstract  principles,  such  as  conservation  of  energy,  while  groupings  formed 
by  subjects  who  had  completed  a  single  course  in  mechanics  were  much  more 
likely  to  be  based  on  superficial  features,  such  as  the  kinds  of  objects 
(pulleys, levers, etc.) that were mentioned in the problems. Chi et al.
also found that abstract physics principles were used by experts when they
gave protocols reporting their thoughts and hunches while deciding on a
"basic approach" to solving the problem. Use of abstract principles was
included in a computational model developed by McDermott and Larkin (1978)
which simulates representation of textbook problems by an expert. The
representation of a problem includes a diagram that represents major
components and relations, and then an abstract representation with
theoretical entities such as forces and energies and relations among these
based on general principles.

Instructional  materials  have  been  designed  by  Reif  and  Heller  (1931) 
that  provide  training  for  beginning  students  in  a  procedure  for 
constructing  abstract  representations  of  problems.  Reif  and  Heller's 
instruction  provides  an  explicit  method  for  arriving  at  a  correct  problem 
representation  like  that  used  by  experts  (although  the  representational 
method  is  not  patterned  after  the  experts'  performance,  in  which  the 
process  of  forming  the  representation  is  rapid  and  apparently  automatic, 
without  easily  discerned  intermediate  steps). 

Larkin  and  Reif  (1979)  also  designed  instruction  to  strengthen 
students'  knowledge  of  relations  among  physics  principles  and  their  ability 
to  apply  principles  in  solving  problems.  The  instruction  grouped 
principles  on  a  chart,  and  suggested  to  students  that  when  certain 
principles are applied, it is generally useful to consider the application
of  other  related  principles.  Qualitative  analogies  were  also  used,  such  as 
a  fluid-current  analogy  for  electric  current  and  a  height  analogy  for 
potential.  Students  who  received  this  instruction  were  more  successful  in 
solving  test  problems  than  other  students  who  only  received  instruction  in 
the  principles,  without  the  organization  and  qualitative  analogies. 

Individuals  with  expert  knowledge  in  a  domain  have  been  shown  to  have 
superior  skill  in  recognizing  complex  patterns  of  information  in  their 
domain  of  expertise.  Domains  in  which  this  phenomenon  has  been 
demonstrated  include  chess  (Chase  &  Simon,  1973),  Go  (Reitman,  1976), 
electronics  (Egan  &  Schwartz,  1979),  computer  programming  (McKeithen, 
Reitman, Rueter, & Hirtle, 1981), and radiology (Lesgold, Feltovich, Glaser,
& Wang, 1981). (We discuss experiments on chess perception in Section
III.B.2.)  Highly  developed  skill  in  pattern  recognition  may  provide  an 
explanation  for  a  finding  that  has  been  obtained  in  several  studies,  namely 
that  expert  problem  solvers  tend  to  work  forward  from  the  given  information 
to  the  unknown,  while  novices  tend  to  work  backward  from  the  unknown, 
searching  through  a  series  of  subgoals  for  formulas  that  can  provide  the 
needed  quantities  (e.g.,  Simon  &  Simon,  1978).  The  conditions  for  applying 
formulas  involve  relatively  complex  patterns  of  known  values  of  variables, 
which  experts  probably  have  learned  to  recognize  directly,  thus  avoiding 
the  more  laborious  searches  that  novices  conduct  (Larkin,  1981).  A  result 
supporting  this  view  was  obtained  by  Malin  (1979),  who  found  that  subjects 
were more likely to adopt a forward-search strategy to solve problems if
the  formulas  that  they  were  using  had  an  obvious  organization  than  if  the 
formulas  did  not  fit  together  in  any  evident  way. 

A  third  characteristic  of  experts'  knowledge  is  that  their 
domain-specific knowledge (e.g., in physics) is integrated with powerful
general  concepts  and  procedures  for  making  inferences.  An  example  is  in 
protocols  obtained  by  Simon  and  Simon  (1978)  from  a  novice  and  an  expert 
subject  on  problems  from  a  high  school  physics  text.  One  problem  was:  "An 
object  dropped  from  a  balloon  descending  at  four  meters  per  second  lands  on 
the  ground  ten  seconds  later.  What  was  the  altitude  of  the  balloon  at  the 
moment the object was dropped?" The novice subject's solution had the
properties of means-ends analysis, using the formula $s = v_0 t + 0.5\,a t^2$. In
contrast, the expert calculated a quantity that he called "the total
additional velocity" by multiplying the time by the gravitational constant
(i.e., 10 × 9.8 = 98), added that to the initial velocity to obtain the
final velocity (98 + 4 = 102), took the average velocity
((4 + 102)/2 = 53), and found the distance by multiplying the average
velocity by the time of ten seconds (53 × 10 = 530 meters). The expert
apparently  had  a  representation  of  the  problem  in  terms  of  physical 
quantities  that  enabled  him  to  apply  general  procedures  such  as  computing 
components  of  velocity  and  taking  an  average,  whereas  the  novice  was 
restricted  to  using  the  formulas  that  were  provided.  Relations  between 
technical  knowledge  and  general  concepts  have  been  investigated 
theoretically  by  deKleer  (1975)  and  by  Bundy  (1978),  who  developed  models 
of  physics  problem  solving  that  combine  general  knowledge  about  the  motion 
of  objects  on  surfaces  with  knowledge  of  formulas  in  kinematics,  and  by 
Larkin  (1982)  who  studied  the  use  of  spatial  information  in  solution  of 
hydrostatics  problems. 
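
The agreement between the novice's and the expert's routes to the answer can be checked numerically. The following Python sketch is only an illustration and is not part of Simon and Simon's analysis: the function names are hypothetical, downward is taken as the positive direction, and the gravitational constant is assumed to be 9.8 m/s^2, as in the expert's arithmetic.

```python
G = 9.8  # gravitational acceleration in m/s^2 (assumed, as in the expert's arithmetic)


def distance_by_formula(v0, t, a=G):
    """Novice route: substitute the givens directly into s = v0*t + 0.5*a*t**2."""
    return v0 * t + 0.5 * a * t ** 2


def distance_by_average_velocity(v0, t, a=G):
    """Expert route: added velocity, then final velocity, then average velocity."""
    added_velocity = a * t                    # "total additional velocity"
    final_velocity = v0 + added_velocity      # initial plus added velocity
    average_velocity = (v0 + final_velocity) / 2
    return average_velocity * t               # distance = average velocity x time


# Balloon problem: initial downward velocity 4 m/s, fall time 10 s.
print(distance_by_formula(4, 10))
print(distance_by_average_velocity(4, 10))
```

Both functions give about 530 meters for the balloon problem (up to floating-point rounding); the difference between the two subjects lies in how the quantities are represented, not in the answer.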

Understanding of Structure and Principles. Integration of
problem-solving knowledge with general conceptual structures also has been
used to characterize structural understanding, like that discussed by
Wertheimer  (1945/1959),  and  understanding  of  general  principles,  including 
the  relation  of  abstract  properties  of  number  (cardinality,  order, 
one-to-one  correspondence)  to  children's  cognitive  procedures  for  counting. 

Understanding of structure has been investigated theoretically by
Greeno (1983) using a problem discussed by Wertheimer (1945/1959), proof of
the congruence of vertical angles. Wertheimer distinguished between a
relatively mechanical process for generating the proof, involving use of
algebra without cognizance of spatial relations in the problem, and a more
meaningful process based on part-whole relations between pairs of angles
and operations of removing a part that is included in two whole angles.
Greeno's model simulates the more meaningful process by using a schema that
represents  part-whole  relations  in  a  general  way  and  applying 
problem-solving  operators  that  make  inferences  based  on  the  part-whole 
structure.  Data  were  available  in  the  form  of  protocols  from  students 
working  on  the  vertical-angles  problem  after  they  had  learned  to  solve 
other  problems  with  similar  part-whole  structure  involving  line  segments. 
The  model  simulates  learning  in  the  line-segment  situation.  When  the 
learned  problem-solving  operators  are  integrated  with  the  part-whole 
schema,  the  model  can  apply  its  knowledge  when  it  encounters  the 
vertical-angles  problem.  The  model  thus  provides  an  explanation  for 
transfer  that  occurs  between  problems  in  different  domains,  with  a 
characterization  of  structural  understanding  based  on  schematic 
representation.  An  account  of  transfer  based  on  acquisition  of  a  schema  in 
a different problem domain is discussed in Section III.A.3.

A  similar  idea  was  used  by  Resnick,  Greeno  and  Rowland  (described  by 
Resnick,  1983)  in  analyzing  children's  understanding  of  a  procedure  for 
subtraction with multidigit numbers. According to their analysis, children
who  understand  the  procedure  have  a  representation  that  includes  general 
relations,  such  as  part-whole  relations  between  quantities  represented  by 
individual  digits  and  the  quantities  represented  by  whole  numbers,  and 
constraints  such  as  requiring  the  quantity  represented  by  a  number  to 
remain unchanged when borrowing is used. The analysis focused on knowledge
acquired in meaningful instruction (cf. Brownell, 1935), in which children
were shown the correspondence between subtraction with numerals and an
analogous subtraction procedure using blocks. Resnick et al.
hypothesized that the understanding was achieved through acquisition of a
schema,  involving  general  part-whole  relations,  that  was  general  enough  to 
apply  to  both  of  the  domains:  the  numerals  and  the  blocks. 

Efforts  also  have  begun  to  develop  rigorous  and  explicit 
characterizations of knowledge that includes implicit understanding of
general  principles  (cf.  Judd,  1908;  Piaget,  1941/1952).  A  representation 
of  preschool  children's  understanding  of  principles  of  counting  has  been 
formulated  by  Greeno,  Riley,  and  Gelman  (in  press).  Greeno  et  al.'s 
analysis  was  based  on  evidence  presented  by  Gelman  and  Gallistel  (1978) 
that  young  children  have  significant  understanding  of  principles  such  as 
cardinality,  order,  and  one-to-one  correspondence,  rather  than  simple 
"mechanical"  knowledge  of  counting  procedures.  The  evidence  includes 
performance  in  novel  situations,  such  as  being  asked  to  evaluate  counting 
performance  by  a  puppet  that  sometimes  makes  errors,  and  counting  with  a 
novel  constraint  of  associating  a  specified  numeral  with  a  specified 
object. Greeno et al. proposed an analysis of conceptual competence to
represent  children's  implicit  understanding  of  principles.  Conceptual 
principles  are  represented  as  schemata  that  incorporate  constraints  on 
correct  counting  and  express  general  properties  such  as  the  part-whole 
relation  between  the  counted  objects  and  the  whole  set.  The  conceptual 
principles  are  related  to  procedures  of  counting  by  a  set  of  planning 
rules,  which  permit  derivation  of  procedures  from  the  schematic 
representations of the principles.

III. Problems of Design and Arrangement

Problems  that  we  discuss  in  this  section  require  finding  an 
arrangement  of  some  objects  that  satisfies  a  problem  criterion.  Simple 
examples include puzzles in which the objects are given in the problem
situation. For example, an anagram presents some letters, and the task is
to find a sequence of those letters that forms a word. In more complex
cases, the problem solver provides the materials based on his or her
knowledge.  Examples  are  composing  an  essay  and  writing  a  computer  program. 

The  problem  space  in  a  problem  of  design  includes  the  objects  that  are 
given  or  are  in  the  problem  solver's  knowledge.  The  space  of  possible 
solutions is the set of arrangements that can be formed with the available
objects. The problem goal is to construct an arrangement that meets a
criterion, which may be either specific or nonspecific. An anagram problem
has a specific criterion: the sequence of letters should form a word. A
written composition has several criteria that are less specific, such as
clear exposition, persuasive argument, and an entertaining style. Many
problems of design have a mixture of specific and nonspecific criteria.
For example, a problem in computer programming can have a criterion of a
specific function to be computed, and less specific criteria such as
efficient computation and clarity of structure.

An important factor in solving problems of design is the satisfaction
of constraints. The metaphor that best characterizes typical solution
processes is "narrowing down the set of possibilities" rather than
"searching through the set of possibilities." Although it is entirely
possible, as we shall see, to describe the solution process as a search,
the main steps in this search lead to the acquisition of new knowledge that
rules out a whole set of problem states as potential solutions -- a
wholesale approach to the reduction of uncertainty. Use of constraints is
important because the set of possible arrangements is usually very large,
compared to the set of arrangements that satisfy the problem criterion.

Problems of design are differentiated from transformation problems,
discussed in Section II, in the nature of the goal and the set of
alternatives that are considered. In a transformation problem such as the
Tower of Hanoi or finding a proof for a theorem, the goal is a specific
arrangement of the problem objects, such as a specific location of all the
disks in the Tower of Hanoi or a specific expression to be proved in logic.
Thus, the question is not what to construct, as it is in a problem of
design, but how the goal can be constructed with the limited set of
operators that are available. The search for the solution of a
transformation problem often examines one problem situation after another,
gaining knowledge that helps point the direction of the search toward the
goal situation.

Viewed  in  another  way,  however,  transformation  problems  and  problems 
of  design  are  very  similar  in  structure.  The  solution  of  a  transformation 
problem is a sequence of actions that changes the initial problem situation
into the goal. The solution process can be considered as construction of
an appropriate sequence of actions, involving search in the very large
space of possible sequences. This view emphasizes similarities between
problems of transformation and problems of design, which are especially
apparent when solution of transformation problems includes planning.

This section discusses design problem solving in four parts. In
Section III.A we discuss two simple problems of forming arrangements:
cryptarithmetic and anagrams. These provide paradigmatic cases for the
analysis of problems of search among sets of possible arrangements.
Section III.B discusses problems in which an arrangement of objects is
already presented, and the task is to modify the arrangement according to
some criterion. These problems include classical studies of matchstick
puzzles by Katona (1940), and selection of moves in board games where the
goal of a player is to strengthen his or her position. In Section III.C we
discuss so-called "insight" problems, which depend on finding a successful
formulation or representation of the problem. Finally, Section III.D
discusses several more complex problems of composition and design,
including composition of essays and musical pieces, design of procedures,
and formation of administrative policies.

III.A. Simple Problems of Forming Arrangements

First,  we  discuss  cryptarithmetic  problems,  analyzed  by  Newell  and 
Simon  (1972),  in  which  digits  are  arranged  to  form  a  correct  addition 
problem, constrained by a set of letters for which the digits are to be
substituted.  Then  we  discuss  anagram  problems,  where  letters  are  to  be 
arranged  to  spell  a  word. 

III.A.1. Cryptarithmetic Problems. Cryptarithmetic problems are best
explained  by  exhibiting  one  of  the  best  known  examples: 

      DONALD
    + GERALD
    --------
    = ROBERT

The  task  is  to  replace  each  distinct  letter  in  the  array  with  a  distinct 
digit, from 0 to 9, the same digit replacing a given letter in all its
occurrences, and no digit being used for more than one letter. To make the
problem easier, the solver is usually told that D = 5.

The  cryptarithmetic  task  was  apparently  first  studied  by  Bartlett 
(1958),  who  reported  some  retrospective  protocols  of  subjects  in  his  book 
on thinking. Subsequently, Newell and Simon (1972) carried out extensive
analyses  of  thinking-aloud  protocols  for  cryptarithmetic  problems.  From 
this  work,  we  now  have  a  rather  clear  picture  of  how  human  subjects 
approach  such  problems. 

There are 10! = 3,628,800 ways of assigning ten digits to ten
letters. Most subjects, without calculating this number, realize that it
is very large, and do not even attempt to solve the problem by making
random assignments and testing them. Instead, they look for information in
the form of constraints that permit values to be assigned to particular
letters at once. If that can be done, the number of possibilities declines
rapidly. Simply giving the information that D = 5 already reduces the
possible solutions by a factor of ten, that is, to 362,880, still a large
number!

The  constraints  in  cryptarithmetic  problems  that  sometimes  make 
systematic elimination possible derive from the fact that each column of the
literal array must be translated into a correct example of addition
(subject to carries into and out of the column). Thus, as soon as it is
known that D = 5, the sixth column can be processed to infer that T
necessarily  equals  0,  and  that  there  is  a  carry  of  1  into  the  fifth  column. 
This  single  inference  reduces  the  remaining  set  of  possible  assignments  to 
40,320,  a  further  reduction  by  a  factor  of  nine. 

Similarly, consideration now of the second column allows the subject
to infer that E is equal to 0 or 9. Since 0 has already been preempted by
T, we have E = 9, reducing the possible assignments to 5,040. A few more
steps of reasoning, based on information contained in columns one and five,
allow the subject to infer that R = 7, reducing the possible assignments to
720. An inference on column four gives A = 4 (120 possibilities remain);
and an inference on column five gives L = 8 (leaving only 24
possibilities). From column one, G = 1 (leaving 6 possibilities), and now
the remaining digits must be assigned to N, O, and B, a task easily carried
out  by  trial  and  error. 
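
The counts quoted above are factorials of the number of letters still unassigned, and the end result can be checked by exhaustive search. The Python sketch below is an illustration only: the function names are hypothetical, and the brute-force enumeration is used purely as a check. It is the blind generate-and-test method that human subjects avoid, not a model of the constraint-based reasoning just described.

```python
from itertools import permutations
from math import factorial

LETTERS = "DONALGERBT"  # the ten distinct letters of DONALD + GERALD = ROBERT


def remaining_assignments(num_fixed):
    """Ways to assign the unused digits to the letters that are still unassigned."""
    return factorial(len(LETTERS) - num_fixed)


def word_value(word, assignment):
    """Read a word as a decimal number under a letter-to-digit assignment."""
    value = 0
    for letter in word:
        value = value * 10 + assignment[letter]
    return value


def solve_donald_gerald(fixed):
    """Test every assignment that is consistent with the already fixed letters."""
    free_letters = [c for c in LETTERS if c not in fixed]
    free_digits = [d for d in range(10) if d not in fixed.values()]
    solutions = []
    for digits in permutations(free_digits):
        assignment = {**fixed, **dict(zip(free_letters, digits))}
        if (word_value("DONALD", assignment) + word_value("GERALD", assignment)
                == word_value("ROBERT", assignment)):
            solutions.append(assignment)
    return solutions


# Size of the candidate set after 0, 1, 2, ... letters have been fixed.
print([remaining_assignments(k) for k in range(8)])

# Exhaustive check over the 9! assignments that remain once D = 5 is given.
solutions = solve_donald_gerald({"D": 5})
print(solutions)
```

The first line printed reproduces the sequence 3,628,800, 362,880, 40,320, 5,040, 720, 120, 24, 6 from the text. The search returns a single assignment, corresponding to 526485 + 197485 = 723970.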

Newell  and  Simon  (1972)  obtained  thinking-aloud  protocols  of  subjects 
solving  cryptarithmetic  problems.  Problem  behavior  graphs  were 
constructed,  based  on  the  protocols,  and  a  detailed  model  of  one  subject's 
problem-solving  processes  was  developed  in  the  form  of  a  production  system. 
(A discussion of this methodology was given in Section II.A.1.) The model
includes several productions that represent a problem-solving strategy.
These productions set goals to examine a column or to examine occurrences
of a variable, make decisions that a value can be assigned to a variable or
that a candidate value should be tested, and perform other general
functions.  There  also  are  a  few  dozen  productions  that  represent  the 
operation  of  specific  processes.  One,  called  Process  Column,  contains  26 
productions;  others  are  considerably  simpler.  The  productions  in  this 
process  examine  the  letters  in  a  column  and  use  any  information  that  has 
been  gathered  about  them  to  make  further  inferences.  The  subject's 
performance,  recorded  in  a  problem  behavior  graph,  was  compared  in  detail 
with the model, and approximately 80% of the protocol units were explained
by  processes  in  the  model. 

Protocols  obtained  from  five  subjects  were  consistent  in  their  general 
characteristics  of  problem-solving  processes.  They  also  revealed 
significant  individual  differences,  and  these  can  be  interpreted  as 
differences  between  the  problem  spaces  of  the  individual  problem  solvers. 
All subjects made use of their knowledge of arithmetic in order to make
inferences, and subdivided the problems into subproblems involving the
columns. There were important differences among subjects in their
strategies  for  selecting  columns  to  work  on,  and  in  their  use  of  specific 
constraints  for  making  inferences. 

For  an  efficient  solution  of  this  problem,  subjects  must  use  a  search 
heuristic  of  attacking  the  most  constrained  columns  first,  for  the  most 
information  can  be  extracted  from  a  column  in  which  the  assignment  of  one 
or  more  letters  has  already  been  made,  or  in  which  the  same  letter  occurs 
twice.  Some  subjects  used  this  column  selection  heuristic  immediately; 
others  began  by  attacking  the  columns  systematically,  from  right  to  left, 
and  only  later  abandoned  that  strategy  for  the  more  powerful  one.  Subjects 
who did not use the heuristic usually failed to solve the problem.

Another factor that influenced success was use of specific
constraints.  The  problem  spaces  of  some  subjects  included  rules  of  parity. 
For  example,  one  of  the  inferences  needed  to  conclude  that  R  =  7  is  that 
whatever  R's  exact  value,  it  must  be  an  odd  number.  This  is  inferred  by 
processing  column  5,  containing  two  L's  whose  sum  must  be  even,  and  the 
carry  of  1,  making  the  total  an  odd  number.  Subjects  whose  problem  spaces 
did  not  include  the  parity  constraints  generally  were  unable  to  solve  the 
problems  that  they  worked  on. 
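
The parity inference can be verified by enumerating every digit that L might take. This is a minimal Python sketch with a hypothetical function name, assuming only the column-5 arithmetic stated above (L + L plus a carry of 1 from the sixth column).

```python
def column_five_results(carry_in=1):
    """Map each candidate digit for L to the digit written in column 5 (L + L + carry)."""
    return {l: (2 * l + carry_in) % 10 for l in range(10)}


results = column_five_results()
print(sorted(set(results.values())))  # → [1, 3, 5, 7, 9]
```

Whatever digit is chosen for L, the digit written for R is odd, which is the parity constraint that successful subjects applied.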

Even  subjects  who  used  the  available  heuristics  and  constraints  for 
efficient  elimination  found  the  DONALD  +  GERALD  problem  difficult.  Most  of 
their  difficulties  arose  from  one  or  both  of  two  sources.  One  such  source 
is  the  making  of  conditional  assignments  (e.g.,  "suppose  that  L  is  1"). 
Then,  if  the  assignment  was  wrong  and  they  arrived  at  a  contradiction,  they 
may  have  been  unable  to  remember  which  prior  number  assignments  they  had 
inferred  definitely  and  which  they  had  postulated  conditionally.  Another 
source of difficulty involved mistakes made in drawing inferences,
resulting in incorrect assignments. For example, some subjects drew from
the fact that R = 7 the conclusion that L = 3 (with a carry from the sixth
column), ignoring the possibility that L might be 8, with a carry into the
fourth column. When L = 3 led to a contradiction, they often had great
difficulty  in  discovering  the  cause. 

Errors  of  inference  are  forms  of  the  errors  of  syllogistic  reasoning 
that  we  discuss  in  Section  V.  In  the  particular  example  just  cited, 
subjects  appeared  to  infer  from  the  premise  "if  L  =  3  then  R  =  7"  and  the 
premise "R = 7" the conclusion, "L = 3," an example of the classical
fallacy of inferring the antecedent from the consequent. They did not
notice that L = 8 also implies R = 7. Thus, the cryptarithmetic task draws
upon  reasoning  processes  as  well  as  upon  search  processes. 
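
The two antecedents that are compatible with R = 7 can be listed explicitly. The following Python sketch (the function name is hypothetical) assumes the column-5 arithmetic described earlier: L + L plus a carry of 1 from the sixth column.

```python
def digits_for_l(r_value, carry_in=1):
    """Return (L, carry_out) pairs for which column 5 (L + L + carry_in) writes r_value."""
    pairs = []
    for l in range(10):
        total = 2 * l + carry_in
        if total % 10 == r_value:
            pairs.append((l, total // 10))  # carry_out goes into the fourth column
    return pairs


print(digits_for_l(7))  # → [(3, 0), (8, 1)]
```

Both L = 3 (no carry) and L = 8 (carry of 1 into the fourth column) produce R = 7, so concluding L = 3 from R = 7 alone overlooks a live alternative.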

Nothing  in  the  behavior  of  subjects  solving  cryptarithmetic  problems 
suggests  that  they  decide  consciously  to  treat  it  as  a  "constraint"  problem 
rather than a "search" problem. In fact, their behavior can be described
as a search through the space of possible assignments, and Newell and
Simon's analysis used this point of view. What distinguishes it from
search  in  many  other  problem  spaces  is  that  the  problem  is  factored  into 
ten  separate  but  interdependent  searches  for  the  individual  assignments. 
Success  in  each  one  of  these  searches  constrains  the  problem  space  by 
reducing  the  number  of  alternative  possibilities  for  the  remaining 
assignments,  and  by  providing  additional  information  about  some  of  the 
columns.  Hence,  it  is  not  dissimilar  from  an  ordinary  search  where  each 
step  of  progress  provides  clear  feedback  of  information  that  the  right 
track  is  being  followed. 

III.A.2. Anagrams. Anagrams are strings of letters that can be
rearranged to form words, for example, thgli -> light. The problem space
of  an  N-letter  anagram  contains  N!  possibilities,  hence  increases  rapidly 
with  N.  The  solution  process  can  be  viewed  as  a  search  through  this  space 
of  permutations  of  the  letters,  but  most  persons  presented  with  an  anagram 
use  various  heuristics  to  speed  up  the  search.  One  of  these  is  to  pick  out 
initial combinations of letters that are pronounceable (e.g., ti or li in
the example above), and then try to complete a word with the remaining
letters. Imposing the condition of pronounceability upon solution attempts
may restrict the search space considerably.

The course of the search also is much influenced by the structure of
long-term  memory.  For  example,  if  there  are  two  possible  solutions  for  an 
anagram,  the  one  corresponding  to  the  more  frequent  and  familiar  word  is 
likely  to  be  found  by  a  large  majority  of  subjects.  Moreover,  the  solution 
may  be  primed  by  presenting  the  word  to  the  subject,  or  a  semantically 
related word, some time before the anagram task is taken up (Dominowski &
Ekstrand,  1967). 

Perceptual  factors  may  affect  performance  on  anagram  tasks.  Anagrams 
that are already words (e.g., forth -> froth) or are easily pronounced
(e.g., obave -> above) take longer to solve than those without these
properties (Beilin & Horn, 1962). This finding is consistent with Gestalt
principles  that  "meaningful"  forms  resist  restructuring.  Gavurin  (1967) 
found  a  correlation  of  .54  between  success  in  solving  anagrams  and  scores 
on  a  standard  test  of  spatial  abilities.  When  the  subject  was  provided 
with  tiles  that  could  be  rearranged  physically,  this  correlation 
disappeared,  indicating  that  the  original  relation  had  to  do  with 
perceptual  abilities  to  operate  on  visual  or  auditory  images. 

It  is  easy  to  induce  a  problem-solving  set  in  anagram  solving  by 
presenting  subjects  with  anagrams  that  all  represent  the  same  permutation 
(say,  5  4  1  2  3)  of  the  letters.  If  an  ambiguous  anagram  (one  admitting 
several  solutions)  is  then  presented,  most  subjects  will  find  the  solution 
requiring the same permutation rather than the alternative solution (Rees &
Israel,  1935). 

Thus,  we  find  in  subjects'  behaviors  on  the  anagram  task  a  combination 
of  search  (generating  possible  solutions)  and  constraint  satisfaction 
(rejecting  non-pronounceable  initial  segments).  The  process  of  alternative 
generation,  in  turn,  is  strongly  influenced  by  long-term  memory 
organization  and  priming,  and  by  subjects'  skills  in  forming  and  holding  in 
STM  the  permutations  of  the  stimulus. 
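
The combination of generation and constraint satisfaction can be sketched for the five-letter example used above. This Python illustration rests on loud assumptions: the names are hypothetical, the one-word LEXICON is a stand-in for a subject's vocabulary in long-term memory, and the list of allowed initial combinations stands in for a pronounceability judgment. It is not a model of any subject's actual process.

```python
from itertools import permutations

LEXICON = {"light"}  # hypothetical stand-in for the solver's vocabulary


def candidate_strings(letters, allowed_starts=None):
    """Generate distinct orderings of the letters, optionally keeping only those
    that begin with one of the allowed initial letter combinations."""
    seen = set()
    for perm in permutations(letters):
        candidate = "".join(perm)
        if candidate in seen:
            continue  # skip duplicates when a letter is repeated
        seen.add(candidate)
        if allowed_starts is None or candidate.startswith(tuple(allowed_starts)):
            yield candidate


all_candidates = list(candidate_strings("thgli"))
pruned = list(candidate_strings("thgli", ["ti", "li"]))

print(len(all_candidates), len(pruned))       # → 120 12
print([c for c in pruned if c in LEXICON])    # → ['light']
```

Restricting attention to the two initial combinations cuts the 5! = 120 orderings to 12 while still retaining the solution.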

III.B.  Problems  of  Modifying  Arrangements 

In the problems discussed in Section III.A, arrangements are formed
"from  scratch;"  materials  are  provided,  and  the  problem  solver  must  put 
them  together  in  a  way  that  satisfies  a  specified  criterion.  Now  we 
discuss  problems  in  which  an  arrangement  of  objects  is  presented,  and  the 
task  is  to  modify  the  arrangement.  We  will  discuss  a  laboratory  problem  of 
this  kind,  the  matchstick  problem  studied  by  Katona  (1940),  and  problems  of 
choosing  moves  in  board  games  such  as  chess  and  Go.  Perceptual  processes 
play an important role in solution of these problems, which involve
recognition  of  general  features  and  complex  patterns  in  the  arrangements 
that  function  as  cues  for  the  selection  of  operations  and  strategic  plans. 

These  problems  combine  features  of  the  transformation  problems 
discussed  in  Section  II  with  features  of  problems  of  design.  Like  design 
problems,  a  goal  is  specified  as  a  general  criterion  rather  than  a  specific 
state  that  the  problem  solver  tries  to  produce.  At  the  same  time,  in  these 
problems  there  are  significant  restrictions  on  the  operators  that  can  be 
used  to  change  the  situation.  Therefore,  the  problems  can  be 
conceptualized  either  as  search  in  a  space  of  possible  arrangements  or  in  a 
space of possible sequences of moves.

Figure 10 here


III.B.1. Matchstick Problems. One of the matchstick problems used by
Katona (1940) is shown in Figure 10. The 16 matches form five squares;
the task is to move exactly three matches in such a way that the matches
form  only  four  squares,  but  all  matches  are  used  as  sides  of  squares. 

Katona  tested  subjects  under  three  conditions:  a  rote  learning  condition 
(subjects were shown a specific solution, and required to learn it), a
condition  in  which  a  logical  condition  for  the  solution  was  taught  (that  in 
the  solution,  each  match  was  a  side  of  one  and  only  one  square),  and  a 
condition  in  which  a  heuristic  for  solving  the  problem  was  taught  ("You 
need  to  open  up  the  figure"). 

The  subjects  learned  the  solutions,  then  were  tested  on  transfer  tasks 
(different  initial  arrangements  of  the  matches  and  different  numbers  of 
squares). Two weeks later they were invited back and tested for their
memory  of  the  solution.  Differences  in  the  ease  of  learning  the  solution 
were  minimal,  with  the  rote  solution  being  learned  most  rapidly.  With 
respect  to  transfer  and  retention,  however,  the  "logical"  and  "heuristic" 
solutions  far  outshone  the  rote  solution,  and  the  heuristic  solution  scored 
slightly  better  than  the  logical.  Katona  concluded  from  this  evidence  that 
problem-solving  knowledge  and  skills  are  better  available  for  transfer  and 
better retained when the learning is meaningful than when it is rote.

The  experimental  manipulations  leave  implicit,  however,  the 
theoretical  import  of  the  term  "meaningful".  Why  does  meaningful  learning 
facilitate  retention  and  transfer,  and  why  is  the  "heuristic"  form  of  the 
instruction  somewhat  superior  to  the  "logical"  form? 

With  respect  to  transfer  and  retention,  meaningful  learning  involves 
the  same  issues  as  structural  understanding,  discussed  at  the  end  of 
Section  II. 0.2.  Transfer  is  facilitated  because  with  more  meaningful 
instructions,  subjects  acquired  knowledge  that  could  be  applied  more 
generally  —  in  particular,  to  the  new  problems  presented  in  the  test  as 
well  as  the  problems  used  in  training.  It  is  easy  to  see  why  this  would 
occur;  the  meaningful  instructions  can  be  applied  to  matchstick  problems 
generally,  while  a  specific  solution  sequence  only  applies  to  a  single 
problem. 

With  respect  to  retention,  it  may  be  that  meaningful  forms  of 
instruction  provide  more  redundancy,  hence  more  opportunity  to  recover  from 
partial  forgetting.  The  general  principles  of  single  vs.  double  function 
and  of  loosening  or  condensing  the  figure  are  constraints  that  can  be  used 
to  limit  search  for  information  in  memory,  or  to  reconstruct  solutions  that 
are  only  partially  remembered. 

The  difference  between  the  two  meaningful  procedures  appears  to  derive 
from  the  distinction  between  generators  and  tests.  The  instruction  to 
"open  up  the  figure"  provides  a  constraint  for  selection  of  an  operator  — 
it  suggests  something  to  do,  however  vague,  related  to  a  general  property 
of  the  figure  that  can  be  perceived.  The  rule,  "each  match  must  be  a  side 
of  one  and  only  one  square,"  is  a  constraint  on  solution  arrangements.  It 
provides  a  test  that  can  be  applied  to  an  attempted  solution,  but  does  not
suggest  what  attempt  to  make  to  produce  the  solution  in  the  first  place.

Figure  10.  A  matchstick  problem  used  by  Katona  (1940).

In  fact,  the  matches  that  are  actually  moved  to  solve  the  problem  are  not 
the  double-function  matches  but  matches  that  already  lie  on  the  side  of 
only  one  square.  In  this  situation,  at  least,  the  knowledge  that 
facilitates  solution  more  effectively  increases  the  selectivity  of  the  move 
generator,  rather  than  the  selectivity  of  candidate  solution  states. 
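
The  distinction  between  constraining  the  generator  and  constraining  the  test  can  be  illustrated  with  a  toy  generate-and-test  search.  This  is  only  a  sketch  under  stated  assumptions:  the  digit-triple  task,  the  function  names,  and  the  "use  only  large  digits"  heuristic  are  hypothetical  stand-ins,  not  Katona's  materials.  The  same  test  is  applied  in  both  runs;  only  the  selectivity  of  the  generator  differs.

```python
from itertools import product


def generate_and_test(generator, test):
    """Return the first candidate passing the test and the number examined."""
    examined = 0
    for candidate in generator():
        examined += 1
        if test(candidate):
            return candidate, examined
    return None, examined


def is_solution(triple):
    # Test: a constraint on finished candidates (a toy rule, not Katona's).
    a, b, c = triple
    return a < b < c and a + b + c == 24


def blind_generator():
    # Proposes every triple of digits; the test does all the selecting.
    return product(range(10), repeat=3)


def selective_generator():
    # Hypothetical heuristic ("use only large digits") narrows what is proposed.
    return product(range(7, 10), repeat=3)


blind = generate_and_test(blind_generator, is_solution)
selective = generate_and_test(selective_generator, is_solution)
print(blind)      # ((7, 8, 9), 790)
print(selective)  # ((7, 8, 9), 6)
```

Both  runs  reach  the  same  solution,  but  the  selective  generator  examines  far  fewer  candidates,  which  is  the  sense  in  which  a  heuristic  that  suggests  what  to  try  can  help  more  than  a  rule  that  only  checks  results.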

Katona  noted  that  the  heuristic  of  opening  the  figure  or  closing  gaps 
uses  a  feature  that  is  important  in  the  perception  of  form,  the  Gestalt 
principle  of  good  continuation.  Attending  to  that  feature  and  considering 
moves  to  change  an  arrangement  with  respect  to  it  constitutes  a  general 
strategy  for  solving  matchstick  problems. 

III.B.2.  Chess  and  Go.  Board  games  present  problems  that  have  the
same  general  form  as  matchstick  problems.  An  arrangement  of  objects  is 
presented  —  the  current  situation  in  the  game  —  and  a  player  has  the  task 
of  selecting  a  move  or  a  move  sequence.  Sometimes  the  criterion  for  a  good 
solution  is  quite  definite  (e.g.,  "white  to  mate  in  four  moves");  more 
often,  it  is  quite  general,  involving  a  goal  to  achieve  a  stronger 
position.  Recent  experiments  have  compared  the  performance  of  individuals 
who  differ  in  their  level  of  skill.  The  results  of  these  experiments  show 
the  importance  of  knowledge  for  recognizing  large  numbers  of  complex 
patterns  that  occur  during  games.  The  relation  of  this  complex  perceptual 
knowledge  to  strategies  of  play  has  been  analyzed  in  recent  theoretical 
studies.

In  complex  games,  as  in  other  domains  in  which  individuals  become 
expert,  problems  that  would  be  difficult  or  impossible  for  novices  are 
often  solved  "instantly"  by  experts  —  that  is,  in  a  few  seconds.  For 
example  (deGroot,  1965),  when  a  chess  grandmaster  is  presented  with  a 
position  from  an  actual  game  with  which  he  was  previously  unfamiliar,  and 
is  asked  to  recommend  a  move,  he  will  usually  be  able  to  report  a  good 
move,  often  the  best  move,  in  five  seconds  or  less.  In  a  "blitz"  game, 
where  he  is  required  to  move  within  ten  seconds,  he  will  probably  be  unable 
to  play  grandmaster-level  chess,  but  will  achieve  master  level.  With  a 
high  level  of  success,  he  will  be  able  to  play  50  or  more  opponents 
simultaneously,  taking  only  a  few  seconds  for  each  move.  When  experts  are 
asked  how  they  solve  problems  so  rapidly,  they  may  reply,  "I  use 
intuition,"  or  "I  use  my  judgment." 

The  nature  of  this  "intuition"  and  "judgment"  has  been  clarified  by 
experiments  on  skill  in  chess  by  deGroot  (1965)  and  Jongman  (1968), 
repeated  and  extended  by  Chase  and  Simon  (1973),  and  on  skill  in  Go  by
Reitman  (1976).  A  chessboard  with  a  position  from  a  game  (containing
perhaps  25  pieces)  is  shown  to  a  subject  for  five  to  ten  seconds.  The
subject  is  then  asked  to  reconstruct  the  position.  Chess  grandmasters  and
masters  can  perform  this  task  with  90  per  cent  accuracy.  Ordinary  players
can  replace  only  five  or  six  pieces  correctly  (20  to  25  per  cent  accuracy). 
In  a  second  condition  the  task  is  the  same,  except  that  the  pieces  are  now 
arranged  on  the  chess  board  at  random,  rather  than  in  a  pattern  that  could 
have  arisen  in  a  game.  In  this  condition,  the  performance  of  masters  falls 
to  the  level  of  that  of  ordinary  players  —  both  can  replace,  on  average, 
only  about  six  pieces.  This  second  part  of  the  experiment  demonstrates 
that  the  chess  masters  do  not  have  any  special  powers  of  visual  imagery. 

Reitman's  (1976)  study  of  skill  in  Go  had  similar  results.  Go  is  a
game  of  territory  played  on  a  19  X  19  grid.  The  pieces  are  round  "stones"
that  are  all  the  same  except,  of  course,  that  they  are  different  in  color
for  the  two  players,  black  and  white.  An  experienced  subject,  very  strong
for  a  non-Oriental  but  not  as  strong  as  a  professional  player,  was  able  to 
reproduce  66%  of  the  pieces  of  meaningful  patterns,  compared  to  39%  for  a
beginner  who  had  played  about  50  games.  On  random  patterns  the  players
replaced  30%  and  25%,  or  an  average  of  five  to  seven  stones.

(This  experimental  procedure  has  now  been  used  to  study
pattern-recognition  abilities  of  experts  in  several  other  domains;  see  our
discussions  of  programming  in  Section  III.D  and  electronics  in  Section
IV.D.)

The  behavior  of  the  chess  and  Go  experts  in  the  perception  and  memory
task  can  best  be  explained  as  a  function  of  their  chess  and  Go  experience.
As  a  result  of  thousands  of  hours  of  time  spent  in  looking  at  game  boards,
they  are  familiar  not  only  with  the  individual  pieces,  but  with  many
configurations  of  three,  four,  or  more  pieces  that  recur  time  and  again  in
games.  For  example,  a  configuration  known  as  a  "fianchettoed  castled  Black
king's  position"  occurs  in  about  one  out  of  every  ten  games  between  expert
chess  players.  This  configuration  is  defined  by  the  positions  of  six
pieces.  It  has  been  estimated  (Simon  &  Barenfeld,  1969;  Simon  &
Gilmartin,  1973)  that  a  chess  master  has  stored  in  long-term  memory  not
fewer  than  50,000  familiar  patterns  of  this  kind.  This  number  is
comparable  to  the  50,000  words  in  the  vocabulary  of  a  typical  college
graduate,  or  perhaps  the  total  number  of  human  faces  a  gregarious  person 
learns  to  recognize  over  a  lifetime. 

When  a  chess  master  is  confronted  with  a  chessboard  on  which  the  pieces
are  arrayed  in  a  "reasonable"  way,  he  can  store  this  information  in
short-term  memory  in  a  half  dozen  or  fewer  "chunks",  that  is,  familiar
configurations.  The  ordinary  player,  or  the  chess  master  confronted  with  a
randomly  arrayed  chessboard,  must  store  the  information  piece  by  piece,
hence  can  hold  the  positions  of  only  about  a  half  dozen  pieces  in
short-term  memory. 
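
The  chunking  account  can  be  made  concrete  with  a  small  sketch.  It  assumes  a  fixed  short-term  capacity  of  six  slots,  represents  pieces  as  arbitrary  integers,  and  encodes  a  position  greedily  from  a  list  of  familiar  patterns;  the  function  name,  the  greedy  encoding,  and  the  example  patterns  are  illustrative  assumptions,  not  the  procedure  of  any  of  the  studies  cited.

```python
def recall(position, known_patterns, capacity=6):
    """Return the pieces recalled when only `capacity` chunks can be held."""
    remaining = set(position)
    chunks = []
    # Encode familiar configurations first, largest patterns first.
    for pattern in sorted(known_patterns, key=len, reverse=True):
        if pattern <= remaining:
            chunks.append(pattern)
            remaining -= pattern
    # Any piece not covered by a familiar pattern occupies a slot of its own.
    chunks.extend(frozenset([piece]) for piece in sorted(remaining))
    recalled = set()
    for chunk in chunks[:capacity]:
        recalled |= chunk
    return recalled


# Hypothetical data: 24 pieces; the "expert" knows six 4-piece patterns.
game_position = set(range(24))
random_position = set(range(100, 124))
expert_patterns = [frozenset(range(i, i + 4)) for i in range(0, 24, 4)]
novice_patterns = []

print(len(recall(game_position, expert_patterns)))    # 24
print(len(recall(game_position, novice_patterns)))    # 6
print(len(recall(random_position, expert_patterns)))  # 6
```

With  the  same  capacity  in  every  case,  the  expert's  advantage  appears  only  when  the  position  contains  familiar  patterns,  which  matches  the  qualitative  result  for  game  positions  versus  random  positions.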

The  skill  that  the  expert  acquires  does  not  simply  consist  in  being 
able  to  recognize  familiar  stimuli  or  configurations  of  stimuli.  As 
deGroot  showed,  the  recognition  of  perceptual  features  on  the  chess  board 
reminds  the  grandmaster  of  moves  that  are  potentially  good  when  those
features  are  present.  Indeed,  we  should  expect  that  the  expert's  knowledge 
for  pattern  recognition  is  integrated  with  strategic  knowledge  so  that  the 
patterns  an  expert  has  learned  to  recognize  are  those  that  are  relevant  to 
choices  of  moves  and  plans  in  game  situations. 

The  importance  of  game  strategy  in  perception  and  representation  of 
complex  patterns  was  shown  in  an  experiment  by  Eisenstadt  and  Kareev
(1975).  Go  and  Gomoku  are  games  with  entirely  different  rules,  but  are
played  on  the  same  board,  and  with  the  same  kinds  of  pieces.  Two  groups  of
subjects,  who  knew  how  to  play  both  games,  were  shown  the  same  patterns  of
stones  on  boards,  but  in  the  one  case  were  told  that  the  patterns  were  from
a  game  of  Go,  in  the  other  case  from  a  game  of  Gomoku.  They  were
subsequently  asked  to  recall  the  patterns.  The  subjects  in  the  first
condition  had  better  recollection  of  the  pieces  that  were  critical  to
selecting  the  correct  move  in  the  Go  position,  while  the  subjects  in  the 
second  condition  tended  to  recall  those  that  were  critical  to  selecting  the
move  in  the  Gomoku  position.  Thus,  in  the  face  of  a  complex  stimulus
situation,  attention  to  a  particular  task  will  determine  the  sequence  in
which  information  is  extracted  from  the  stimulus,  and  the  patterns  in  which
it  will  be  organized.

Specific  knowledge  structures  that  integrate  strategic  knowledge  and 
knowledge  for  recognizing  patterns  have  been  studied  by  Wilkins  (1980),  in
a  model  of  choosing  moves  in  chess,  and  by  Reitman  and  Wilcox  (1978),  in  a 
model  of  playing  Go. 

Wilkins'  (1980)  model  represents  board  positions  by  recognizing
concepts,  such  as  Attack  and  Safe,  based  on  relations  among  pieces.  The
model  uses  schemata  that  correspond  to  the  concepts  in  proposing  and
evaluating  plans.  In  formulating  a  plan,  a  concept  such  as  Safe  or
Defend-Threat  can  be  set  as  a  goal;  the  schema  for  each  concept  includes
conditions  that  are  required  to  satisfy  the  goal.  The  model's  strategy  of
using  proposed  plans  to  guide  its  search  restricts  the  set  of  moves  it
considers,  enabling  relatively  thorough  evaluations.  The  model  is
successful  in  solving  problems  of  choosing  moves  in  middle  game  positions
that  are  difficult  enough  to  be  used  in  a  standard  chess  textbook.

Reitman  and  Wilcox's  (1978)  model  simulates  representation  of  board 
positions  and  changes  of  board  positions  in  Go.  The  model  forms  a
multilevel  representation  with  low-level  units  such  as  strings  and  chains
of  stones,  and  higher-level  units  called  groups  and  fields  involving 
collections  of  points  and  their  surrounding  stones.  The  representations 
include  features  that  are  relevant  to  Go  tactics,  such  as  the  stability  of 
a  group  of  stones.  Perceptual  activity  is  organized  according  to  several 
structures  including  lenses,  which  monitor  changes  on  the  board  relevant  to 
relations  between  groups  of  stones,  and  webs,  which  monitor  changes  on 
radii  and  circumferences  around  groups.  The  model's  capabilities  for
representation,  combined  with  some  relatively  low-level  processes  for
selecting  moves,  yield  play  similar  to  that  of  a  human  player  with
experience  of  playing  40  or  50  games.

The  ability  of  experts  to  recognize  complex  patterns  of  information 
related  to  a  highly  integrated  structure  of  actions  has  been  found  in  other 
domains  in  which  expertise  has  been  analyzed.  We  have  discussed  the 
importance  of  knowledge  for  representing  problems  in  physics  in  Section 
II.D.  Analyses  that  we  discuss  in  Section  IV.D  of  knowledge  for  expertise
in  medical  diagnosis  and  electronic  troubleshooting  have  led  to  similar
conclusions.  A  conjecture  that  is  reasonable  on  present  evidence  is  that
high  levels  of  expertise  generally  require  a  repertory  of  tens  of  thousands
of  perceptual  "chunks"  relevant  in  the  domain.  In  domains  where  the 
minimal  time  required  to  become  a  world-class  master  has  been  measured,  the 
estimate  turns  out  consistently  to  be  about  a  decade  (Hayes,  1981;  we
discuss  this  finding  for  musical  composition  in  Section  III.D.1).

III.C.  Construction  Tasks  and  Other  Insight  Problems.

Much  research  attention  has  been  given  to  problems  in  which  some
physical  device  or  arrangement  is  required,  often  to  satisfy  a  functional
criterion.  An  example  is  Duncker's  (1935/1945)  famous  "tumor"  problem.  A
patient  has  a  stomach  tumor,  which  is  to  be  destroyed  by  radiation  without
damaging  the  surrounding  healthy  tissue.  How  is  it  to  be  done? 

The  source  of  difficulty  in  construction  problems  is  rather  different 
from  that  in  the  problems  discussed  in  Sections  III.A  and  III.B,  where
difficulty  is  caused  primarily  by  the  large  number  of  possible  solutions.  The  tumor
problem,  and  other  "insight"  problems,  are  difficult  primarily  because  most
of  the  candidate  solutions  that  people  think  of  are  ruled  out  by  the
constraints  of  the  problem.  In  the  tumor  problem,  for  example,  one  cannot
simply  direct  the  rays  to  the  tumor,  because  that  would  destroy  all  the
tissue  along  their  path;  one  cannot  open  a  path  to  the  tumor  because  the
surgical  procedures  would  cause  intolerable  damage,  and  so  on.  The
"textbook"  solution  of  the  tumor  problem  is  that  the  tumor  should  be
irradiated  from  many  different  angles,  and  hence  via  many  different  paths
through  the  surrounding  tissue.  Then  a  large  quantity  of  radiation  can  be
concentrated  on  the  tumor  while  each  path  of  surrounding  tissue  is
subjected  to  only  a  small  fraction  of  that  amount. 
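
The  arithmetic  behind  the  convergence  solution  can  be  sketched  as  follows.  The  sketch  assumes  an  idealized  case  that  is  not  stated  in  the  source:  beams  of  equal  intensity,  doses  that  simply  add  at  the  tumor,  and  paths  that  overlap  only  at  the  tumor.  The  function  names  and  the  numbers  are  hypothetical.

```python
import math


def dose_per_path(tumor_dose, n_beams):
    """Dose received by the tissue along each path when n equal beams converge."""
    return tumor_dose / n_beams


def min_beams(tumor_dose, tissue_tolerance):
    """Smallest number of equal beams that keeps every path within tolerance."""
    if tumor_dose <= 0 or tissue_tolerance <= 0:
        raise ValueError("doses must be positive")
    return math.ceil(tumor_dose / tissue_tolerance)


# Hypothetical units: the tumor needs 100, healthy tissue tolerates 30.
n = min_beams(100, 30)
print(n)                      # 4
print(dose_per_path(100, n))  # 25.0
```

Under  these  assumptions  the  tumor  still  receives  the  full  dose,  while  each  path  through  healthy  tissue  receives  only  the  total  divided  by  the  number  of  beams.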

Solving  the  tumor  problem  and  similar  insight  problems  often  depends
on  finding  a  way  to  represent  the  problem  so  that  the  solution  becomes
"obvious".  Achievement  of  such  a  representation,  corresponding  to  a  moment
of  insight,  is  a  phenomenon  of  great  interest,  especially  in  relation  to
issues  of  cognitive  organization  in  Gestalt  psychology.  In  problems  such
as  cryptarithmetic  and  anagrams,  the  problem  space  is  easily  constructed,
and  problem-solving  activity  consists  of  search  in  the  set  of  possibilities
that  arise  in  that  space.  In  contrast,  in  insight  problems  such  as  the
tumor  problem,  the  problem  solver's  initial  representation  usually  provides
an  inadequate  problem  space,  one  in  which  a  solution  will  not  be  found.
Problem  solving  involves  a  construction  of  several  problem  spaces,  with
discovery  of  factors  that  make  each  of  them  inadequate,  until  a  successful
representation  is  found.  Processes  of  problem  representation,  discussed  in
Section  II.D,  thus  play  a  central  role  in  solution  of  these  problems  of
construction.  The  process  can  be  characterized  as  a  search,  where  the
possibilities  are  alternative  ways  to  represent  the  problem.  However,  the
usefulness  of  such  a  characterization  is  limited  unless  the  set  of
alternative  representations  can  be  specified  more  definitely  than  we  are 
able  to  at  present. 

Duncker  (1935/1945)  emphasized  the  demand,  the  condition  to  be  met  by
the  problem  solution,  as  the  chief  source  of  solution  proposals.  The
initial  proposals  are  not  unmotivated,  but  they  are  faulty  in  not  attending
to  all  the  conditions  a  solution  must  meet.  False  analogies  may  produce
inadequate  solutions  because  the  analogy  does  not  match,  on  crucial
dimensions,  the  actual  situation.  At  the  same  time,  Duncker  stressed  that
the  proposals  are  not  produced  by  simple  association  (page  3): 

In  short,  it  is  evident  that  such  proposals  are  anything  but
completely  meaningless  associations.  Merely  in  the  factual  situation,
they  are  wrecked  on  certain  components  of  the  situation  not  yet  known
or  not  yet  considered  by  the  subject.
Occasionally  it  is  not  so  much  the  situation  as  the  demand,  whose
distortion  or  simplification  makes  the  proposal  practically  useless.

By  constructing  a  taxonomy  of  correct  and  inadequate  solutions  to  the
tumor  problem,  Duncker  showed  how  the  solution  generation  process  can  be
understood  as  a  process  of  means-ends  analysis.  His  taxonomy  can  be
depicted  in  outline  form: 

Treatment  of  tumor  by  rays  without  destroying  healthy  tissue
    Avoid  contact  between  rays  and  healthy  tissue
        Use  free  path  to  stomach
            Use  esophagus
        Remove  healthy  tissue  from  path  of  rays
            Insert  a  cannula
        Insert  protective  wall  between  rays  and  tissue
            Feed  substance  that  protects
        Displace  tumor  toward  surface
            By  pressure
    Desensitize  the  healthy  tissue
        Inject  desensitizing  chemical
        Immunize  by  adaptation  to  weak  rays
    Lower  intensity  of  rays  through  healthy  tissue
        Postpone  full  intensity  until  tumor  is  reached
        Use  weak  intensity  in  periphery,  strong  near  tumor
            By  use  of  lens
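
The  outline  can  be  read  as  a  goal  tree  in  which  each  general  goal  is  refined  into  more  specific  subgoals  until  a  concrete  proposal  is  reached.  The  sketch  below  encodes  that  reading  as  a  nested  dictionary  and  lists  each  concrete  proposal  with  the  chain  of  goals  above  it.  The  nesting  is  an  assumed  reconstruction  of  the  outline's  levels,  and  the  function  name  is  hypothetical.

```python
# Each key is a goal or proposal; its value holds the subgoals that refine it.
# The nesting is an assumed reading of the outline, not a quotation of it.
GOAL_TREE = {
    "Treatment of tumor by rays without destroying healthy tissue": {
        "Avoid contact between rays and healthy tissue": {
            "Use free path to stomach": {"Use esophagus": {}},
            "Remove healthy tissue from path of rays": {"Insert a cannula": {}},
            "Insert protective wall between rays and tissue": {
                "Feed substance that protects": {},
            },
            "Displace tumor toward surface": {"By pressure": {}},
        },
        "Desensitize the healthy tissue": {
            "Inject desensitizing chemical": {},
            "Immunize by adaptation to weak rays": {},
        },
        "Lower intensity of rays through healthy tissue": {
            "Postpone full intensity until tumor is reached": {},
            "Use weak intensity in periphery, strong near tumor": {
                "By use of lens": {},
            },
        },
    }
}


def proposals(tree, path=()):
    """Yield (chain_of_goals, concrete_proposal) for every leaf of the tree."""
    for goal, subgoals in tree.items():
        if subgoals:
            yield from proposals(subgoals, path + (goal,))
        else:
            yield path, goal


for chain, proposal in proposals(GOAL_TREE):
    print(" > ".join(chain[1:]), "=>", proposal)
```

Reading  a  proposal  together  with  its  chain  of  goals  shows  which  functional  demand  it  was  meant  to  serve,  which  is  the  sense  in  which  the  proposals  are  motivated  rather  than  arbitrary  associations.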

Duncker  described  the  solution  process  as  successive  development  or
reformulation  of  the  problem.  Both  working  forward  and  working  backward
may  contribute  to  the  process.  Seeing  a  stick  may  give  a  chimpanzee  the
clue  to  obtaining  a  banana  that  is  out  of  reach.  Alternatively,  the
banana's  being  out  of  reach  may  lead  the  chimpanzee  to  look  for  objects
that  could  be  used  to  reach  it  (cf.  Kohler,  1929).  Mistakes  may  also  call
attention  to  features  of  the  problem  situation  that  must  be  incorporated  in
the  solution,  and  hence  may  lead  to  new  solution  attempts.

From  the  idea  that  problem  solution  depends  on  an  appropriate 
formulation,  it  would  be  expected  that  hints  could  be  used  to  make  problems
significantly  easier.  One  experiment  on  the  effect  of  hints  used  a  problem
of  constructing  a  hatrack,  invented  by  Maier  (1945).  Two  sticks  and  a
clamp  were  given,  and  the  hatrack  could  be  constructed  by  clamping  the
sticks  together  so  they  could  be  wedged  between  the  floor  and  the  ceiling.
Subjects  usually  began  by  placing  one  stick  on  the  floor  and  clamping  the
other  stick  to  it  vertically,  or  by  clamping  the  sticks  in  an  X  or  inverted
V  shape  with  one  end  of  each  stick  resting  on  the  floor.  Neither  of  these
structures  is  stable.
If  the  experimenter  said,  "In  the  correct  solution,  the  clamp  is  used  as
the  coat  hanger,"  solution  was  facilitated  somewhat,  mainly  by  reducing
attempts  with  one  stick  lying  on  the  floor.  If  the  experimenter  said,  "In
the  correct  solution  the  ceiling  is  part  of  the  construction,"  solution  was
facilitated  more  strongly,  with  reduction  both  of  attempts  that  have  one
stick  on  the  floor  and  of  attempts  that  have  one  end  of  each  stick  on  the
floor  (Burke,  Maier  &  Hoffman,  1966).

A  potential  source  of  problem  solutions  is  analogy  with  similar
problems.  Gick  and  Holyoak  (1980)  gave  Duncker's  tumor  problem  to  subjects,
some  of  whom  had  studied  a  story  in  which  a  fortress  is  taken  by  a
converging  attack.  The  subjects  who  were  familiar  with  the  military
problem  were  more  successful  than  control  subjects  in  solving  the  tumor
problem.  An  important  factor  was  use  of  an  instruction  that  the  story
might  provide  a  useful  hint  for  solving  the  problem.  With  the  hint,  most
subjects  found  the  convergence  solution  to  the  tumor  problem,  but  without
the  hint  only  about  one-half  as  many  subjects  found  that  solution,  even
though  they  had  read  the  story  and  recalled  it  in  a  test.

In  a  subsequent  study,  Gick  and  Holyoak  (1981)  examined  conditions
favoring  spontaneous  use  of  an  analogy.  There  was  little  effect  of  asking
subjects  to  summarize  the  military  story,  rather  than  to  recall  it,  nor  did
giving  a  verbal  statement  or  a  diagram  showing  the  convergence  principle
notably  increase  subjects'  use  of  the  analogy.  However,  more  solutions
were  given  by  subjects  who  read  two  stories  involving  convergence,
summarized  both  of  them,  and  discussed  ways  in  which  the  stories  were
similar.  Gick  and  Holyoak  concluded  that  subjects  acquired  a  schema  with
the  idea  of  convergence  represented  in  a  general  way,  and  that  use  of  such
a  schema  is  more  likely  than  use  of  a  specific  analogous  problem.  (Recall
the  use  of  a  similar  hypothesis  in  interpreting  learning  a  subtraction
procedure  with  understanding,  based  on  an  analogy  with  blocks,  discussed  in
Section  II.D.2.)

Duncker  (1935/1945)  also  studied  problems  that  required  subjects  to
construct  or  assemble  some  item  out  of  potential  components  that  were
provided.  He  showed  that  the  problems  could  be  made  difficult  by
presenting  one  of  the  components  in  such  a  way  that  it  was  conceptually
"unavailable"  for  its  required  function.  In  one  problem,  for  example,  the
building  materials  were  a  candle,  matches,  and  a  box  full  of  thumbtacks.
The  task  was  to  mount  the  candle  on  a  wall  so  that  it  could  burn  without
dripping  wax  on  the  floor.  The  problem  could  be  solved  by  thumbtacking  the
box  to  the  wall,  then  mounting  the  candle  in  it.

This  problem  is  sufficiently  difficult  that  fewer  than  half  the 
subjects  in  one  experiment  were  able  to  solve  it  in  20  minutes  (Adamson, 
1952).  When  the  problem  was  presented  to  another  group  of  subjects  with 
the  thumbtacks  lying  on  a  table,  and  the  box  empty,  86  per  cent  solved  it 
in  less  than  20  minutes.  The  phenomenon  underlying  this  finding  has  been 
labeled  "functional  fixity."  When  an  object  is  performing  one  function,  or 
has  recently  been  used  to  perform  some  function,  subjects  are  less  likely
to  recognize  its  potential  use  for  another  function. 

Birch  and  Rabinowitz  (1951)  demonstrated  a  similar  phenomenon,  using
another  problem  studied  originally  by  Maier  (1931).  Subjects  were
introduced  to  a  room  where  two  strings  were  hanging  from  a  ceiling,  too  far
apart  to  be  reached  simultaneously.  The  task  was  to  tie  them  together.
This  could  be  accomplished  if  a  heavy  object  were  tied  to  one  string  and
the  string  swung  as  a  pendulum.  The  subject  could  reach  this  string  as  it
swung  toward  him  or  her  while  he  or  she  was  grasping  the  other  string.  Two
objects,  an  electric  switch  and  a  relay,  were  available  for  constructing
the  pendulum.  The  subjects  had  used  either  the  switch  or  the  relay  (but
not  both)  in  a  previous  task.  Of  ten  subjects  who  had  used  the  relay
previously,  all  used  the  switch  to  construct  the  pendulum;  of  nine  who  had
used  the  switch,  seven  used  the  relay  to  construct  the  pendulum.  Of  six
subjects  who  had  used  neither  object  previously,  three  used  the  switch  and
three  the  relay  to  construct  the  pendulum.

Several  findings  support  a  hypothesis  that  functional  fixity  results 
from  a  decrease  in  the  likelihood  of  noticing  certain  critical  features  of
objects  in  the  situation,  such  as  the  flatness  of  a  box  (in  use  as  a
container),  or  the  heaviness  of  a  switch  (after  use  in  a  circuit),  or  the
mobility  of  a  string.  The  mechanisms  that  produce  decreased  noticing  of
features  in  functional  fixity  may  be  quite  different  in  different 
situations,  involving  restrictive  hypotheses  about  general  classes  of 
solutions  in  some  cases,  and  simple  competition  between  feature-recognition 
processes  in  others. 

Some  of  the  findings  that  support  this  explanation  involve
demonstrations  that  solution  of  problems  can  be  strongly  influenced  by
quite  low-level  perceptual  factors.  For  example,  in  the  pendulum  task,  the
idea  of  making  the  one  string  swing  to  make  it  reachable  by  someone  holding
the  other  does  not  occur  readily  to  most  subjects,  even  in  the  presence  of
one  or  more  heavy  objects.  Maier  (1931)  showed  that  this  idea  occurred
immediately  to  many  subjects  who  had  not  previously  thought  of  it  when  the
experimenter  casually  brushed  against  the  string  and  set  it  swinging.
Glucksberg  and  Weisberg  (1966)  presented  pictures  of  the  materials
available  for  use  in  solving  Duncker's  candle  problem,  and  found  that
solutions  were  markedly  increased  when  the  label  "box"  was  included  in  the
picture.  A  process  of  noticing  features  of  objects  that  can  be  related  to
the  problem  goal  (Duncker's  "suggestions  from  below")  probably  plays  a
significant  role  in  solution  of  construction  problems,  as  Weisberg  and  Suls
(1971)  concluded  in  their  theoretical  analysis  of  solution  processes  for
the  candle  problem.  Results  consistent  with  that  idea  were  obtained  by
Magone  (1977),  who  found  that  subjects  produced  a  greater  variety  of
solutions  in  Maier's  two-string  problem  if  they  were  initially  prompted  to
consider  features  of  objects  than  if  they  were  initially  prompted  to  find  a
solution  of  a  specified  kind,  such  as  extension  of  one  of  the  strings  or
causing  a  string  to  swing  back  and  forth.

The  Einstellung  effect  discussed  in  section  II. 1.2  is  similar  in 
character  to  functional  fixity;  both  effects  result  from  the  influence  of 
previous  experience  upon  the  availability  of  alternative  solution  steps  for 
problems.  Furthermore,  the  processes  responsible  for  the  two  effects
probably  are  analogous  in  a  subtle  but  significant  way.  Both  involve  a
condition  in  which  a  form  of  search  is  made  less  likely  than  it  would  be
normally.  In  the  case  of  Einstellung,  previous  use  of  one  solution  path
suppresses  a  search  for  problem-solving  operators.  In  the  case  of
functional  fixity,  a  search  for  features  of  objects  that  could  be  useful  in
the  problem  is  suppressed.

Another  "insight"  problem  that  has  been  studied  is  the  nine-dot
problem.  A  three-by-three  matrix  of  dots  is  given,  and  the  task  is  to
connect  all  the  dots  with  four  straight  lines  without  any  retracing.  The
problem  is  difficult;  most  subjects  do  not  think  of  drawing  lines  outside
the  space  defined  by  the  matrix  of  dots,  as  is  required  for  the  solution.
The  difficulty  is  apparently  another  instance  of  a  restricted  domain  of
search,  but  the  obvious  hypothesis  of  a  restriction  based  on  the  spatial
arrangement  is  not  supported  by  data.  Weisberg  and  Alba  (1981)  instructed
subjects  that  they  should  draw  lines  outside  the  square  of  dots,  but  that
had  little  effect.  However,  when  they  gave  an  easier  problem  requiring
drawing  lines  beyond  the  region  that  contained  dots,  solution  of  the
nine-dot  problem  was  facilitated.  A  reasonable  interpretation  is  that  the
easier  problem  led  the  subjects  to  consider  problem-solving  operators  that
were  not  in  the  problem  space  of  subjects  who  did  not  solve  the  simpler
problem  first.  This  finding  involves  the  same  principles  as  the  finding  of
Katona  (1940;  Section  III.B.1)  that  a  heuristic  for  choosing  operators
is  more  effective  than  a  test  applicable  to  the  results  of  operators.
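
The  role  of  the  operator  set  in  the  nine-dot  problem  can  be  checked  with  a  small  exhaustive  search.  The  sketch  below  rests  on  simplifying  assumptions  that  are  not  in  the  source:  lines  may  turn  only  at  integer  grid  points,  the  "restricted"  problem  space  allows  turning  points  only  on  the  dots  themselves,  and  the  "extended"  space  adds  one  row  and  one  column  of  points  beyond  the  dots.  The  no-retracing  condition  is  not  checked  by  the  code;  the  listed  solution  satisfies  it  by  inspection.  All  names  are  hypothetical.

```python
from itertools import product

DOTS = [(x, y) for x in range(3) for y in range(3)]


def on_segment(p, a, b):
    """True if point p lies on the closed straight segment from a to b."""
    cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    inside_box = (min(a[0], b[0]) <= p[0] <= max(a[0], b[0])
                  and min(a[1], b[1]) <= p[1] <= max(a[1], b[1]))
    return cross == 0 and inside_box


def covers_all(path):
    """True if the connected segments along `path` pass through every dot."""
    segments = list(zip(path, path[1:]))
    return all(any(on_segment(d, a, b) for a, b in segments) for d in DOTS)


def solvable(turn_points, n_lines=4):
    """Search every connected path of n_lines segments over the turn points."""
    return any(covers_all(path)
               for path in product(turn_points, repeat=n_lines + 1))


def lattice(lo, hi):
    return [(x, y) for x in range(lo, hi) for y in range(lo, hi)]


# A known solution; two of its turning points lie outside the square of dots.
KNOWN = [(0, 0), (0, 3), (3, 0), (0, 0), (2, 2)]

print(covers_all(KNOWN))        # True
print(solvable(lattice(0, 3)))  # False: turning points restricted to the dots
print(solvable(lattice(0, 4)))  # True: turning points beyond the dots allowed
```

Under  these  assumptions  no  amount  of  search  within  the  restricted  space  can  succeed,  whereas  adding  the  operator  "turn  at  a  point  beyond  the  dots"  makes  a  solution  reachable,  which  parallels  the  interpretation  that  the  easier  problem  changed  the  operators  subjects  considered.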

III.D.  More  Complex  Tasks  of  Composition  and  Design.

In  this  section  we  discuss  analyses  of  several  complex  tasks  of
composition  and  design.  We  begin  with  composition  of  written  essays  and
music,  tasks  requiring  methods  for  handling  multiple  constraints.  Then  we
discuss  two  tasks,  answering  examination  questions  and  designing  policies,
in  which  the  problem  solvers'  recognition  and  knowledge  of  constraints  play
an  important  role  in  successful  performance.  Third,  we  discuss  design  of
procedures,  in  which  the  materials  are  actions  that  can  be  performed  and
the  task  is  to  use  those  actions  to  construct  a  procedure,  such  as  a
computer  program,  that  satisfies  a  criterion.

III.D.1.  Problems  of  Composition.  We  discuss  an  analysis  of
composing  written  essays,  developed  using  thinking-aloud  protocols  by
Flower  and  Hayes  (1980),  and  an  instructional  study  of  writing  by  Bereiter
and  Scardamalia  (1982).  Then  we  briefly  describe  a  study  of  composition  in
music  by  Reitman  (1965),  and  data  that  show  the  need  for  extensive  training
in  a  domain  as  a  condition  for  creative  accomplishments  (Hayes,  1981). 

Flower  and  Hayes  (1980)  have  studied  the  task  of  writing  an  essay,  a
task  in  which  constraints  play  a  major  role.  They  noted  that  successful
writing  requires  simultaneous  compliance  with  a  large  number  of
constraints,  operating  at  different  levels.  One  requirement  is  selection
and  organization  of  ideas  from  the  writer's  knowledge  into  a  coherent
network  of  concepts  and  information  for  inclusion  in  the  essay.  Another
set  of  constraints  involves  the  linguistic  and  discourse  conventions  of
written  language.  A  third  set  of  constraints  is  rhetorical,  involving  the
need  to  arrange  the  essay  so  it  accomplishes  the  writer's  purpose  for  the
intended  audience. 

Using  protocols  obtained  from  subjects  working  on  writing  tasks,  Hayes
and  Flower  (1980)  found  three  general  processes:  planning,  translating,
and  reviewing.  These  three  processes  allow  the  writer  to  attend  to  a
subset  of  the  constraints  at  any  time.  In  planning,  information  is
generated  from  the  problem  solver's  memory  relevant  to  the  topic,  and
decisions  are  made  about  what  to  include.  In  translating,  text  is  produced
using  information  that  has  been  retrieved,  consistent  with  a  writing  plan
that  has  been  formed.  In  reviewing,  the  generated  text  is  evaluated  and
revised  in  accord  with  rhetorical  purposes  and  constraints  of  text
structure,  as  well  as  more  detailed  linguistic  concerns  such  as  correct
grammar.  Hayes  and  Flower  found  that  writing  involves  a  mixture  of  these
processes,  and  postulated  that  the  writing  process  includes  a  monitor  that
determines  the  sequence  of  subprocesses,  depending  on  the  nature  of
difficulties  that  arise. 
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To  write  successfully,  an  individual  must  understand  the  constraints 
that  apply  at  various  levels  to  the  text,  must  have  effective  methods  for 
generating  or  revising  text  to  conform  to  those  constraints,  and  must 
actively  engage  in  evaluation  with  respect  to  the  constraints.  In  studies 
of young writers, Bereiter and Scardamalia (1982) noted that inattention to
constraints,  especially  global  rhetorical  concerns,  characterizes  the 
writing  of  many  children.  When  they  revise  text  that  they  have  produced, 
most  children  attend  exclusively  to  low-level  constraints,  usually  changing 
only  single  words  or  small  phrases,  rather  than  attempting  to  improve  more 
significant  general  features  of  their  essays.  Bereiter  and  Scardamalia 
hypothesized  that  the  difficulty  lies  in  the  process  of  evaluating  the 
text, rather than in a lack of understanding of rhetorical goals or a lack
of effective means for producing improved text. They gave students a set
of cue cards with evaluative comments, such as "I need another example
here," "The reader won't be convinced by this," "Even I seem to be confused
here," and "This is a good sentence." The children's task was to choose a
card that seemed appropriate for each sentence in their texts, and to make
appropriate changes. The technique was effective, consistent with the idea
that children have difficulty managing the process of evaluating their
texts and applying global constraints, rather than lacking knowledge of the
constraints  and  methods  for  complying  with  them. 

Multiple  interacting  constraints  also  characterize  composition  of 
music,  as  was  shown  in  an  analysis  by  Reitman  (1965)  based  on  a  protocol 
obtained  from  a  professional  composer  as  he  wrote  a  fugue.  Reitman  noted 
that  schematic  structures  that  he  called  transformational  formulas  played 
an  important  role;  these  included  knowledge  of  the  main  components  of  the 
musical  form  being  composed  (exposition,  development,  and  conclusion)  as 
well as subcomponents of those units (exposition => thematic material +
countermaterial; thematic material => motive + development, etc.). He
found that much problem-solving activity was concerned with constraints.
Some  constraints  were  generated  by  properties  of  the  instrument  (piano) 
chosen  for  the  piece,  requiring  musical  material  suited  to  the  instrument. 
Other  constraints  were  produced  by  material  already  included  in  the  piece, 
such  as  a  requirement  that  countermaterial  should  be  compatible  with 
thematic  material,  but  sufficiently  different  to  provide  interest.  The 
composer  characterized  patterns  that  he  developed  as  conventions,  producing 
melodic,  rhythmic,  and  instrumental  properties  that  were  then  "used  to 
carry  on  the  movement  of  the  music"  (Reitman,  1965,  p.  169),  with 
variations  introduced  to  maintain  interest. 

A  substantial  knowledge  base  is  required  for  solving  problems  of 
composition,  and  an  important  question  is  how  much  experience  and  training 
an  individual  needs  to  make  substantial  creative  contributions  to  a  field 
such  as  musical  composition.  Using  data  from  biographies  and  a  standard 
catalogue of recordings, Hayes (1981) determined the time between a
composer's  beginning  serious  musical  training  and  the  first  composition 
that  had  five  independent  recordings  in  the  catalogue.  In  almost  every 
case,  at  least  ten  years  of  virtually  full-time  training  occurred  before  a 
composer  produced  a  work  of  such  high  quality  that  it  is  common  in  the 
recorded repertoire.

III.D.2. Recognition and Knowledge of Constraints. In problems that
require satisfaction of constraints, a problem solver must recognize the
constraints in order to perform successfully. In Section III.A.1 we
discussed Newell and Simon's (1972) finding that individual differences in
cryptarithmetic performance depended on whether solvers included
significant constraints, such as odd-even parity, in their problem
spaces. Now we discuss two more
studies  that  have  investigated  this  factor.  In  a  study  of  performance  on 
examination  questions,  Bloom  and  Broder  (1950)  found  that  some  students 
performed  poorly  because  they  failed  to  recognize  constraints  stated  in  the 
questions.  In  a  study  of  problems  involving  administrative  policy,  Voss, 
Greene,  Post,  &  Penner  (1983)  found  that  novice  problem  solvers  gave 
inadequate  formulations  of  problems  because  they  lacked  knowledge  of 
significant  factors  in  the  problem  situation. 

In  comprehensive  college  examination  questions  studied  by  Bloom  and 
Broder  (1950),  students  often  were  required  to  make  inferences  or  deal  with 
information  presented  in  an  unusual  form.  For  students  who  performed 
poorly,  a  significant  factor  was  the  students'  inattention  to  constraints 
in  the  statements  of  some  questions.  For  example,  a  question  might  ask  a 
student  to  choose  the  best  explanation  of  a  situation,  but  the  student 
would ignore the relation of alternative answers to the situation and pick
the one that seemed most nearly true in itself. For these students, the
activity  of  problem  solving  occurred  in  a  problem  space  that  lacked  some  of 
the  information  that  was  required  for  good  performance.  Bloom  and  Broder 
developed  an  instructional  method  in  which  students  compared  their  own 
problem-solving  process,  recorded  in  a  thinking-aloud  protocol,  with  the 
process  of  another  student  whose  performance  was  more  successful.  This 
training was effective for many students, teaching them to attend more
carefully to constraints in questions and to use other helpful
problem-solving strategies, such as increased attempts to infer plausible
answers  from  information  they  could  retrieve  from  memory. 

Voss et al. (19xx) obtained thinking-aloud protocols on problems
involving design of an administrative policy. For example, problem solvers
were  asked  to  develop  a  policy  for  improving  agricultural  productivity  in  a 
region  of  the  Soviet  Union.  Subjects  with  different  amounts  of  knowledge 
about  Soviet  government  and  history  worked  on  the  problem,  including 
students  in  an  introductory  course  in  Soviet  politics,  experts  in  political 
science,  some  who  specialized  in  the  Soviet  Union  and  others  with  other 
specialties,  and  experts  in  another  field  altogether,  chemistry.  The 
solution  process  of  experts  was  primarily  one  of  formulating  the  problem, 
with  a  long  initial  period  of  considering  historical  and  political  factors 
and  successive  reformulations  based  on  evaluations  of  proposed  solutions 
against known constraints. The inexpert student subjects, with much less
knowledge than the experts, gave problem formulations that lacked inclusion
of important constraints. Experts in chemistry worked more systematically
than the political science students, sometimes using general knowledge
about administrative systems to provide useful conjectures, but also lacked
the  rich  formulations  that  characterized  the  problem  solvers  with 
specialized  knowledge. 

III.D.3.  Design  of  Procedures.  To  complete  this  section,  we  discuss 
tasks  in  which  the  materials  are  a  set  of  actions  that  can  be  performed, 
and the problem solver constructs a procedure with these components. In a
simple example, a list of errands was shown to subjects, along with a map,
and the subjects constructed a schedule for performing as many of the
errands as possible during a day (Hayes-Roth & Hayes-Roth, 1978). More
complex examples are computer programming and software design, and design
of experimental procedures in microbiology. We have mentioned the
similarity of these problems to problems of transformation, discussed in
Section II, especially when planning is used for constructing a sequence of
actions  to  reach  the  problem  goal.  The  knowledge  structures  of  experienced 
problem  solvers  that  have  been  analyzed  in  domains  of  software  design  and 
design  of  experimental  procedures  are  very  similar  to  the  domain-specific 
knowledge  used  in  planning,  for  example  in  the  domain  of  geometry  discussed 
in Section II.B.1.

Hayes-Roth and Hayes-Roth (1978) gave subjects a map of a fictitious
town, showing the locations of several stores and other businesses.
Subjects also were given a list of errands, such as buying fresh vegetables
at  the  grocery,  picking  up  medicine  for  a  dog  at  the  vet,  and  seeing  a 
movie.  Subjects  were  to  plan  a  schedule  that  included  as  many  of  the 
errands  as  possible.  The  task  presents  some  general  constraints,  mainly  a 
limited  total  amount  of  time  available.  It  also  presents  local  constraints 
and  interactions.  For  example,  it  is  better  to  buy  groceries  later  in  the 
day,  so  they  will  still  be  fresh  when  the  shopper  returns  home;  and  it  is 
best  to  go  to  the  movie  at  one  of  the  times  that  the  feature  is  starting. 
Interactions include proximity of shops: it is more efficient to place
together in the sequence errands that involve shops near one
another.

Hayes-Roth  and  Hayes-Roth  simulated  performance  on  their  planning  task 
with  a  model  that  contains  several  planning  specialists  and  a  blackboard 
control  structure,  a  design  similar  to  one  used  earlier  in  a  speech 
understanding  system  called  Hearsay  (Reddy,  Erman,  Fennell,  &  Neely,  1973). 
The  specialists  are  designed  to  make  suggestions  about  different  kinds  of 
planning  decisions.  They  all  have  access  to  inferences,  suggestions,  and 
other  information,  which  is  located  in  the  system's  blackboard.  This 
system  design  supports  a  feature  called  opportunistic  planning,  which  was 
found  in  the  performance  of  human  problem  solvers.  Opportunities  arise  in 
the  form  of  conditions  that  make  it  easy  to  include  an  errand,  such  as  the 
proximity  of  a  store  to  a  place  that  is  already  included  in  the  plan,  and 
an  appropriate  specialist  can  be  activated  by  that  condition. 
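
The idea of opportunistic planning can be illustrated with a minimal sketch. This is not the Hayes-Roth and Hayes-Roth model: the errand names, map coordinates, and nearness threshold are hypothetical, and a single "proximity specialist" stands in for the several specialists of the model. The specialist reads the current plan from a shared blackboard and suggests any errand that lies close to a place already in the plan.

```python
from math import dist

# Hypothetical map: errand name -> (x, y) location in a fictitious town.
ERRANDS = {
    "grocery": (0.0, 0.0),
    "vet": (5.0, 5.0),
    "florist": (0.5, 0.2),
    "movie": (9.0, 1.0),
}


def proximity_specialist(blackboard, near=1.0):
    """Suggest unplanned errands that lie close to an already planned one."""
    planned = blackboard["plan"]
    suggestions = []
    for name, location in ERRANDS.items():
        if name in planned:
            continue
        if any(dist(location, ERRANDS[p]) <= near for p in planned):
            suggestions.append(name)
    return suggestions


def plan_opportunistically(first_errand):
    """Extend the plan for as long as the specialist sees an opportunity."""
    blackboard = {"plan": [first_errand]}
    while True:
        suggestions = proximity_specialist(blackboard)
        if not suggestions:
            return blackboard["plan"]
        blackboard["plan"].append(suggestions[0])


plan = plan_opportunistically("grocery")
```

In this sketch the florist is added because it is near the grocery, while the vet and the movie are left for other (unmodeled) specialists to schedule.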

In  writing  a  computer  program,  one  designs  a  procedure  that  performs  a 
designated  function.  Studies  of  computer  programmers  and  designers  have 
revealed  important  characteristics  of  the  knowledge  required  for  solution 
of  these  design  problems. 

Soloway, Ehrlich, Bonar, and Greenspan (1982) gave three problems,
typical  of  elementary  programming  courses,  to  students  in  the  first  and 
second  introductory  courses  in  programming.  They  identified  schematic 
cognitive  structures  that  they  called  plans,  needed  for  successful  problem 
solving.  The  required  schemata  are  quite  basic,  involving  construction  of 
iterative  loops  and  use  of  variables.  The  schemata  provide  knowledge  of 
requirements  for  performing  significant  program  functions,  such  as 
interactions  between  processing  and  testing  a  variable  within  a  loop  and 
between the loop processing and initialization. Students who lacked
adequate versions of these schemata made significant errors, for example,
failing to recognize distinctions between different looping structures.

Experiments on memory for program texts have shown that experienced
programmers can recall more successfully than beginners (Adelson, 1981;
McKeithen, Reitman, Rueter, & Hirtle, 1981), a phenomenon like that found
for memory of chess board positions (Section II.B.2) and other domains.
The acquisition of plan schemata as hypothesized by Soloway et al.
provides a natural explanation of this finding.

More advanced problems, involving software design, were studied by
Polson, Atwood, Jeffries, & Turner (1981). A task in software design
involves planning a complex program; actual writing of the program is
performed separately. A problem studied by Polson et al. was design of a
program for compiling an index for a text, given a set of key words to be
included in the index. Problem solutions with thinking-aloud protocols
were given by professional software designers and by students. The experts
recognized functions that had to be included in the solution, such as
defining a data structure for the text and searching the key word set for a
word that matches each word encountered in the text. Polson et al.
concluded that experts' knowledge includes general design schemata that
enable decomposition of problems, progressively forming more well-defined
subproblems, with knowledge of specific techniques for some subproblems
that are encountered. These schemata provide another example of knowledge
for action organized hierarchically like that developed by Sacerdoti (1977;
Section II.B).
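
To make the indexing task concrete, the following is a minimal sketch of one possible decomposition into subproblems. It is an illustration only, not a design produced by any subject in the study; the representation of the text as a list of page strings and the function names are assumptions made for the example.

```python
import re
from collections import defaultdict


def tokenize(page_text):
    """Subproblem 1: turn raw page text into lowercase words."""
    return re.findall(r"[a-z]+", page_text.lower())


def build_index(pages, key_words):
    """Subproblem 2: match each word in the text against the key word set.

    `pages` is a list of strings, one per page (a hypothetical text structure).
    Returns a dict mapping each key word found to the pages where it occurs.
    """
    keys = {k.lower() for k in key_words}
    index = defaultdict(list)
    for page_number, page_text in enumerate(pages, start=1):
        for word in tokenize(page_text):
            if word in keys and page_number not in index[word]:
                index[word].append(page_number)
    # Subproblem 3: present the entries in alphabetical order.
    return dict(sorted(index.items()))


pages = [
    "Planning precedes search.",
    "Search uses schemata.",
    "Schemata guide planning.",
]
index = build_index(pages, ["planning", "schemata", "search"])
```

Each function corresponds to one subproblem, so that a specific technique (here, set lookup for matching) can be chosen for a subproblem without revising the rest of the design.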

Problems  of  experimental  design  have  been  studied  in  the  domain  of 
microbiology;  two  versions  of  a  program  called  Molgen  have  been  developed. 
One,  by  Stefik  (1981),  designs  procedures  for  modifying  the  genetic 
structure  of  microorganisms.  An  important  issue  considered  by  Stefik  was 
the  handling  of  constraints  that  arise  from  interactions  between 
components of a procedure. Molgen designs procedures in a top-down
manner, with abstract plan schemata gradually made more specific. A method
of  constraint  posting  was  developed  in  which  requirements  for  one  of  the 
design  components  can  be  taken  into  account  in  the  decisions  made  about 
other  components. 

The  second  version  of  Molgen,  by  Friedland  (1979),  designs  analytic 
experiments  such  as  determining  the  sequence  of  base  molecules  in  a  DNA 
string  or  finding  a  set  of  restriction  sites  on  a  molecule.  This  model 
uses  schemata  called  skeletal  plans  that  incorporate  information  about 
experimental  procedures  that  are  used  to  develop  specific  experimental 
plans  through  a  process  of  filling  in  details,  based  on  the  specific 
problem  requirements. 


In  this  section  we  discuss  inductive  problem  solving.  In  a  problem  of 
induction,  some  material  is  presented  and  the  problem  solver  tries  to  find 
a  general  principle  or  structure  that  is  consistent  with  the  material. 
Important  examples  include  scientific  induction,  including  situations  in 
which  the  material  is  a  set  of  numerical  data  and  the  task  is  to  induce  a 
formula  or  a  molecular  structure,  language  acquisition,  where  the  material 
is  a  set  of  sentences  and  the  task  is  to  induce  the  rules  of  grammar  for 
the  language,  and  diagnosis,  where  the  material  is  a  set  of  symptoms  and 
the  task  is  to  induce  the  cause  of  the  symptoms.  Analogy  problems  and 
problems  of  extrapolating  sequences  are  inductive  tasks  that  are  widely 
used  in  intelligence  tests.  The  task  of  inducing  a  rule  for  classifying 
stimuli  into  categories  has  been  used  in  a  large  and  significant  body  of 
experimental  study. 

An  induction  problem  presents  a  dual  problem  space  that  includes  a 
space  of  stimuli  or  data  and  a  space  of  possible  structures,  such  as  rules, 
principles, or patterns of relations (cf. Simon & Lea, 1974). The task
can  be  conceptualized  as  a  search  in  the  space  of  structures.  The  problem 
is  solved  by  finding  a  structure  that  satisfies  a  criterion  of  agreement 
with  the  stimuli  or  data.  An  experimental  subject  can  be  tested  by 
requiring  use  of  the  structure  for  stimuli  that  have  not  yet  been  shown. 
When  the  task  is  to  induce  a  rule  for  classifying  stimuli,  new  stimuli  may 
be  presented  to  test  whether  the  subject  can  classify  them  correctly.  When 
the  task  is  to  induce  a  pattern  in  a  sequence,  the  subject  may  be  required 
to  extend  the  sequence  by  producing  additional  elements  that  fit  the  same 
pattern  as  those  that  are  given. 

Solving  an  induction  problem  can  proceed  in  two  ways,  and  in  most 
tasks  a  combination  of  the  methods  is  used.  The  first  is  a  top-down 
method.  It  involves  generating  hypotheses  about  the  structure  and 
evaluating  them  with  information  about  the  stimulus  instances.  Second, 
there  is  a  bottom-up  method  that  involves  storing  information  about  the 
individual  stimuli  and  making  judgments  about  new  stimuli  on  the  basis  of 
similarity  or  analogy  to  the  stored  information.  To  perform  the  top-down 
method,  the  problem  solver  requires  a  procedure  for  generating  or  selecting 
hypotheses,  a  procedure  for  evaluating  hypotheses,  and  then  a  way  of  using 
the  hypothesis  generator  to  modify  or  replace  hypotheses  that  are  found  to 
be  incorrect.  To  use  the  bottom-up  method,  the  problem  solver  needs  a 
method  of  extrapolating  from  stored  information,  either  by  judging 
similarity  of  new  stimuli  to  stimuli  stored  in  memory  or  by  forming 
analogical  correspondences  with  stored  information. 
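
The contrast between the two methods can be sketched in a few lines of code. The sketch below is an illustration under simplifying assumptions, not a model from the literature: the stimuli and labels are made up, the top-down hypothesis space is restricted to single-feature rules, and the bottom-up similarity measure simply counts matching features.

```python
# Labelled instances: (features, is_positive). All values are made up.
INSTANCES = [
    ({"color": "red", "shape": "circle"}, True),
    ({"color": "red", "shape": "square"}, True),
    ({"color": "green", "shape": "circle"}, False),
]


def generate_hypotheses(instances):
    """Top-down, step 1: propose every single-feature rule seen in the data."""
    return sorted({(k, v) for feats, _ in instances for k, v in feats.items()})


def consistent(hypothesis, instances):
    """Top-down, step 2: a rule survives if it labels every instance correctly."""
    key, value = hypothesis
    return all((feats[key] == value) == label for feats, label in instances)


def top_down(instances):
    """Generate hypotheses, then discard those contradicted by the instances."""
    return [h for h in generate_hypotheses(instances) if consistent(h, instances)]


def bottom_up(new_stimulus, instances):
    """Classify a new stimulus by the label of the most similar stored instance."""
    def similarity(feats):
        return sum(feats[k] == new_stimulus.get(k) for k in feats)

    _, label = max(instances, key=lambda item: similarity(item[0]))
    return label


surviving = top_down(INSTANCES)
guess = bottom_up({"color": "red", "shape": "triangle"}, INSTANCES)
```

The top-down routine returns an explicit rule, whereas the bottom-up routine classifies a new stimulus without ever stating a rule.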

Induction  involves  a  form  of  understanding,  where  a  representation  is 
found  that  provides  an  integrated  structure  for  diverse  stimuli.  This 
general  feature  also  characterizes  processes  of  representing  problems 
(e.g., textbook physics problems) discussed in Section II.D. There, the
space of stimuli is the information in the problem situation (often, a
problem text or instructions), and the space of structures is a set of
possible  representations  that  can  be  constructed.  To  be  successful,  a 
problem  representation  must  provide  the  information  needed  to  achieve  the 
problem  goal.  Thus,  in  representing  transformation  problems,  the  inductive 
search  is  constrained  by  the  requirements  of  problem-solving  operators  that 
are available. Often in problems of induction, such constraints are not
present: one does not have to do anything with the pattern that is found
in the information. However, in some inductive problems, such as medical
diagnosis, there are strong constraints related to available operators.
The goal is to restore the ailing person to proper functioning, and the
effort  to  induce  a  cause  serves  the  goal  of  determining  an  effective 
remedy. 

In  some  task  domains,  the  possible  structures  are  represented 
explicitly  as  formulas.  Examples  include  induction  of  quantitative 
formulas from numerical data in physics, or induction of the molecular
structure  of  a  chemical  compound.  Patterns  induced  in  letter-sequence 
problems  also  consist  of  explicit  formula-like  rules.  These  tasks  share 
important  properties  with  problems  of  design  and  arrangement,  discussed  in 
Section  III.  The  goals  of  these  induction  tasks  can  be  considered  as 
design  of  a  formula  that  agrees  with  the  data.  Solution  of  design  problems 
generally  requires  use  of  strong  constraints  to  limit  the  space  of 
possibilities for search, and this important property is also found in
tasks  that  involve  induction  of  formulas. 

Our  discussion  of  inductive  problem  solving  will  be  in  four 
subsections. In Section IV.A, we discuss induction of categorical
concepts, including induction of rules for classifying stimuli and
categorical concepts in the form of prototypes. In Section IV.B we discuss
induction of more complex concepts involving sequential stimuli: patterns
in sequences of letters and the grammatical rules of a language. Section
IV.C discusses induction of relational structure including analogy test
items and induction of regularities and structures in science and
mathematics. Finally, Section IV.D discusses diagnostic problem solving,
including  medical  diagnosis  and  electronic  trouble-shooting. 

IV.A. Categorical Concepts

Of the various inductive tasks that have been studied, by far the
greatest attention has been given to induction of categorical concepts.
This is partly in recognition of their practical importance. Our human
capability  of  organizing  experience  using  conceptual  categories  undoubtedly 
contributes  much  to  making  our  cognitive  lives  manageable. 

In  an  experiment  on  concept  induction,  the  experimenter  constructs  a 
set  of  stimuli  (e.g.,  diagrams  with  figures  that  vary  in  shape,  size, 
color,  and  other  attributes),  and  decides  on  a  rule  for  classifying  the 
stimuli  (e.g.,  "the  red  circles  are  positive,  all  other  stimuli  are 
negative").  The  subject  is  given  information  about  several  individual 
stimuli  —  that  is,  is  told  whether  each  stimulus  is  positive  or  negative. 
The  subject's  task  is  to  induce  the  rule  of  classification.  Usually,  the 
experimenter  tests  whether  the  subject  has  induced  the  concept  by 
presenting  new  stimuli  to  determine  whether  the  subject  can  classify  them 
correctly. 

In an early discussion, Woodworth (1938) distinguished between
processes of concept induction involving bottom-up and top-down methods.
In a bottom-up process, knowledge of the concept is analogous to a
composite photograph, consisting of an impression summed over the various
stimuli in the category with the common features emphasized and the
variable characteristics "washed out." In a top-down process, the
problem solver actively constructs hypotheses about features that define
the concept and tests these hypotheses with additional information about
examples.

We  discuss  research  on  concept  induction  in  three  subsections.  The 
first  two  discuss  studies  of  top-down  processes  of  inducing  concepts  that 
are  defined  by  two  or  more  stimulus  features  and  then  of  concepts  defined 
by a single feature. The third subsection discusses studies of bottom-up
processes of inducing concepts.

Figure  11  here 


IV.A.1. Multifeature Concepts. When two or more stimulus features
are  combined  to  form  a  categorical  concept,  they  are  combined  in  some 
logical  formula,  such  as  "A  and  B,"  or  "If  A  then  B."  A  stimulus  is  a 
positive  example  of  the  concept  if  the  formula  is  true  as  a  description  of 
the  stimulus.  Consider  the  set  of  stimuli  shown  in  Figure  11.  The  concept 
"Green  and  Circle"  specifies  the  stimuli  in  the  second  column  from  the 
left.  The  concept  "Green  or  Circle"  specifies  the  stimuli  that  are  in 
columns  1,  2,  3,  5,  and  8. 
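
The difference between a conjunctive and a disjunctive formula can be checked mechanically. The sketch below uses a small hypothetical stimulus set with only two attributes (it does not reproduce the array of Figure 11) and treats each concept as a predicate that is true or false of a stimulus.

```python
from itertools import product

# Hypothetical attribute values; the set below pairs every color with every shape.
COLORS = ["green", "red", "black"]
SHAPES = ["circle", "square", "cross"]
STIMULI = [{"color": c, "shape": s} for c, s in product(COLORS, SHAPES)]


def green_and_circle(stim):
    """Conjunctive concept: both features must be present."""
    return stim["color"] == "green" and stim["shape"] == "circle"


def green_or_circle(stim):
    """Disjunctive concept: either feature is sufficient."""
    return stim["color"] == "green" or stim["shape"] == "circle"


conjunctive = [s for s in STIMULI if green_and_circle(s)]
disjunctive = [s for s in STIMULI if green_or_circle(s)]
```

With nine stimuli, the conjunction picks out one stimulus and the disjunction picks out five (three green, three circles, with the green circle counted once).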

Consider  the  requirements  for  performance  of  this  task,  assuming  that 
it  is  done  in  a  top-down,  hypothesis-testing  manner.  First,  the  stimulus 
features  must  be  discriminated;  the  problem  solver  must  have  processes  for 
recognition  of  the  features  that  are  used  to  define  concepts.  Second, 
there must be a process for hypothesis formation, which constructs
candidate hypotheses to be considered. Third, a process of hypothesis
evaluation is needed to test the hypotheses that have been formed.
Finally, a process for hypothesis modification is required to use the
results of the tests to eliminate incorrect hypotheses and to change
existing hypotheses or form new ones.

A  landmark  study  of  multifeature  concept  induction  was  conducted  by 
Bruner,  Goodnow,  and  Austin  (1956).  Bruner  et  al.  observed  subjects 
working  on  concept  induction  problems,  including  verbal  reports  about  their 
hypotheses.  The  results  we  discuss  here  were  from  experiments  in  which 
subjects  were  instructed  that  concepts  were  conjunctions  of  features,  but 
had to induce how many features were relevant and what the features were.
We consider two experiments in which the problem was to induce a concept
defined  as  the  conjunction  of  two  features. 

In  one  experiment,  subjects  were  required  to  solve  two  problems  with 
the  array  of  Figure  11  shown,  and  a  third  problem  of  the  same  kind  "from 
memory,"  that  is,  with  the  stimuli  not  available.  In  each  case,  the 
problem  began  with  the  experimenter  providing  a  positive  instance,  a 
stimulus  that  was  a  member  of  the  concept  category.  Then  the  subject  could 
choose  any  stimulus  in  the  display  and  ask  whether  it  was  a  positive  or 
negative instance of the concept. The subject could offer a hypothesis
after the choice of a card, but this was not required. The subject
continued choosing cards and receiving information until the correct
concept was induced.

Bruner et al.'s results included characterizations of a variety of
strategies  used  by  subjects  in  selecting  stimuli.  Strategies  of  one  kind, 
called  focussing  strategies,  involve  finding  a  positive  instance  of  the 
concept,  then  determining  which  of  its  features  are  relevant.  For  example, 
suppose  the  concept  was  "Red  and  Circle."  The  subject  might  be  told  that 
the  stimulus  with  three  red  circles  and  two  borders  is  a  positive  instance. 
The  subject  then  could  choose  a  stimulus  differing  from  that  focus  stimulus 
in  the  number  of  circles,  say,  two  red  circles  with  two  borders.  This 
would  be  a  positive  instance,  and  the  subject  would  infer  that  the  number 
of  figures  is  not  a  relevant  attribute.  Then  the  subject  might  vary  the 
color  of  the  figures,  choosing  the  stimulus  with  three  green  circles  and 
two  borders.  This  would  be  a  negative  instance,  and  the  subject  would 
infer  that  the  color  of  the  figures  is  relevant,  that  is,  that  "Red"  is 
part  of  the  definition  of  the  concept.  With  further  choices  and 
information,  the  concept's  definition  would  be  inferred. 
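
A focussing strategy of this kind can be written as a short procedure. The sketch below is a simplified illustration, not Bruner et al.'s own formalization: it assumes a conjunctive concept, four attributes with three hypothetical values each, and an "experimenter" function that answers whether a chosen stimulus is positive. Starting from a known positive instance, it changes one attribute at a time and keeps the attributes whose change destroys membership.

```python
# Hypothetical attribute values, chosen for illustration.
ATTRIBUTES = {
    "number": [1, 2, 3],
    "color": ["red", "green", "black"],
    "shape": ["circle", "square", "cross"],
    "borders": [1, 2, 3],
}


def make_experimenter(concept):
    """Return a function that says whether a stimulus is a positive instance."""
    return lambda stim: all(stim[a] == v for a, v in concept.items())


def focus(focus_stimulus, is_positive):
    """Vary one attribute at a time from a known positive instance."""
    relevant = {}
    choices = 0
    for attr, values in ATTRIBUTES.items():
        other = next(v for v in values if v != focus_stimulus[attr])
        probe = dict(focus_stimulus, **{attr: other})
        choices += 1
        if not is_positive(probe):
            # Changing this attribute destroyed membership, so it is relevant.
            relevant[attr] = focus_stimulus[attr]
    return relevant, choices


experimenter = make_experimenter({"color": "red", "shape": "circle"})
start = {"number": 3, "color": "red", "shape": "circle", "borders": 2}
induced, n_choices = focus(start, experimenter)
```

Under these assumptions the procedure needs one choice per attribute and holds only the focus stimulus and the list of relevant attributes in memory.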

Other  strategies,  called  scanning  strategies,  involve  consideration  of 
specific  hypotheses  and  use  of  information  to  narrow  down  the  set  of 
possible  hypotheses.  For  example,  a  subject  might  consider  as  distinct 
possibilities the hypotheses "Three figures," "Red," "Three and Red,"
"Circle," "Three Circles," and "Red Circles." Then, on finding that a
stimulus with two red circles and two borders is a positive instance, the
subject could eliminate all the hypotheses with the property "Three."
Use of a scanning strategy places severe demands on memory. It is
impossible to consider all of the possible hypotheses simultaneously
(there are 255 of them), but it is desirable to consider as many as one
can, since information can only be used to evaluate hypotheses in the
sample being considered.
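
The size of the hypothesis space, and the effect of one piece of information on it, can be worked out by enumeration. The sketch below assumes four attributes with three values each and conjunctive concepts only; under that assumption each attribute is either left unspecified or fixed at one of three values, giving 4 x 4 x 4 x 4 = 256 combinations, of which one (nothing specified) is not a concept, leaving 255. The attribute values are hypothetical.

```python
from itertools import product

# Hypothetical attribute values, chosen for illustration.
ATTRIBUTES = {
    "number": [1, 2, 3],
    "color": ["red", "green", "black"],
    "shape": ["circle", "square", "cross"],
    "borders": [1, 2, 3],
}


def all_conjunctive_hypotheses():
    """Enumerate every non-empty conjunction of attribute values."""
    names = list(ATTRIBUTES)
    options = [[None] + ATTRIBUTES[name] for name in names]
    hypotheses = []
    for combo in product(*options):
        hyp = {n: v for n, v in zip(names, combo) if v is not None}
        if hyp:  # skip the empty conjunction
            hypotheses.append(hyp)
    return hypotheses


def matches(hyp, stim):
    """True if the stimulus has every feature named by the hypothesis."""
    return all(stim[a] == v for a, v in hyp.items())


def eliminate(hypotheses, stim, is_positive):
    """Keep only hypotheses that agree with the information about the stimulus."""
    return [h for h in hypotheses if matches(h, stim) == is_positive]


hypotheses = all_conjunctive_hypotheses()
positive = {"number": 2, "color": "red", "shape": "circle", "borders": 2}
remaining = eliminate(hypotheses, positive, True)
```

A single positive instance leaves only the 15 hypotheses built from its own features, but a scanner who holds just a handful of hypotheses in memory can apply the information only to that handful.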

The  focussing  strategies  and  the  scanning  strategies  differ  primarily 
in  the  processes  they  use  in  formation  of  hypotheses.  In  the  focussing 
strategies,  information  about  instances  is  used  to  constrain  hypothesis 
formation.  Tests  are  performed  to  see  whether  an  attribute  is  relevant, 
and  when  the  attribute  is  eliminated,  no  hypothesis  using  it  will  be 
formed.  If  the  focussing  strategy  is  used  successfully,  all  but  the 
correct  attributes  can  be  eliminated,  and  the  correct  hypothesis  can  be 
formed directly. In the scanning strategies, less use is made of problem
information in forming hypotheses, and hypotheses that are in the
sample are tested directly with information about instances. The use of
information  in  evaluating  hypotheses  is  somewhat  more  direct  in  the 
scanning  strategies,  but  there  is  a  consequently  greater  requirement  for 
record-keeping  in  memory  regarding  a  large  set  of  hypotheses. 

There  were  12  subjects  in  Bruner  et  al.'s  experiment,  and  their 
performance  was  used  to  classify  them  as  focussers  and  scanners.  A  subject 
was  classified  as  a  focusser  if  the  majority  of  his  or  her  choices  differed 
in  just  one  attribute  from  features  of  the  focus  stimulus  that  had  been 
found  relevant  or  were  as  yet  untested,  or  involved  specific  variations  of 
this  selection  process,  including  redundant  tests  or  attempts  to  test  more 
than  one  attribute  with  a  stimulus.  Seven  subjects  were  classified  as 
focussers  and  the  rest  were  treated  as  scanners.  The  focussing  strategy 
was  advantageous  for  the  subjects  who  used  it.  They  required  about 
one-half  as  many  choices  as  the  scanners  to  solve  a  problem  with  the 
stimulus  array  present  (medians  of  five  and  ten  choices,  respectively,  for 
focussers and scanners). Further, the scanners had noticeably greater
difficulty solving a problem "in their heads" than they did when the
stimuli  were  present  (median  of  13  choices),  except  for  one  scanner  who 
discovered  the  focussing  strategy  in  working  on  the  third  problem.  The 
focussers'  performance  without  stimuli  present  did  not  differ  from  their 
second  problem  with  the  stimuli. 

Bruner et al. conducted two experiments to investigate situational
factors  that  influenced  subjects'  choices  of  strategies.  One  experiment 
compared  the  effect  of  an  orderly  arrangement  of  stimuli  with  the  same 
stimuli presented haphazardly. The stimuli were abstract forms, differing
on  six  dimensions  with  two  values  on  each  dimension.  With  the  64  stimuli 
arranged  systematically,  similar  to  the  arrangement  in  Figure  11,  almost 
all subjects used focussing strategies. When stimuli were not arranged
systematically,  subjects  typically  used  scanning  strategies.  Another 
condition  in  which  there  was  a  tendency  to  use  scanning  strategies  was  when 
concrete  stimuli  were  used,  such  as  drawings  of  persons  varying  in  sex, 
size,  and  clothing. 

An analysis by Hunt (1962; Hunt, Marin & Stone, 1966) provided a
hypothesis  about  how  to  represent  categorical  concepts  in  cognitive 
structure.  Hunt  proposed  that  knowledge  of  a  categorical  concept  is  a 
cognitive  procedure  for  deciding  whether  a  stimulus  is  or  is  not  a  member 
of  the  category.  The  form  of  the  procedure  that  Hunt  investigated  is  a 
decision  network,  a  structure  of  perceptual  tests  organized  in  a  way  that 
reflects  the  logical  structure  of  the  concept.  (This  same  form  was  used  by 
Feigenbaum  (1963)  for  the  Elementary  Perceiver  and  Memorizer,  used  in 
simulations  of  rote  verbal  memorizing.  Examples  of  such  decision  networks, 
for  recognizing  some  concepts  in  geometry  problems,  are  in  Figures  5  and 
6.) Experiments conducted by Trabasso, Rollins & Shaughnessy (1971)
provided  evidence  that  supports  Hunt's  characterization.  Trabasso  et  al. 
measured  latencies  for  categorical  decisions  about  stimuli  and  obtained 
results  that  agreed  with  Hunt's  model;  longer  times  were  required  for 
decisions  in  which  the  model  specifies  a  larger  number  of  perceptual  tests. 
A  model  that  simulates  acquisition  of  conjunctive  concepts  was  developed  by 
Williams  (1971)  using  Hunt's  representational  hypothesis  along  with 
assumptions  about  limited  short-term  memory  capacity  and  changes  in  the 
salience  of  dimensions. 
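
A decision network of the kind Hunt described can be sketched as a nested structure of perceptual tests. The example below is an illustration under assumed names (the concept "red circle" and the data layout are hypothetical, not Hunt's program); it counts the tests performed, which is the quantity that the latency findings relate to decision time.

```python
# A decision network as nested tuples: (attribute, {value: subtree_or_leaf}).
# Leaves are True (member) or False (non-member).
RED_CIRCLE = ("color", {
    "red": ("shape", {"circle": True, "square": False}),
    "green": False,
})


def classify(network, stimulus):
    """Walk the network; return (is_member, number_of_perceptual_tests)."""
    tests = 0
    node = network
    while not isinstance(node, bool):
        attribute, branches = node
        node = branches[stimulus[attribute]]
        tests += 1
    return node, tests


one_test = classify(RED_CIRCLE, {"color": "green", "shape": "circle"})
two_tests = classify(RED_CIRCLE, {"color": "red", "shape": "square"})
print(one_test, two_tests)
```

A green stimulus is rejected after one test, whereas a red square requires two, so the sketch predicts a longer decision for the second case.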

An  important  aspect  of  the  acquisition  of  complex  concepts  is 
induction  of  the  logical  relation  between  the  stimulus  features  in  the 
definition.  This  has  been  studied  by  Bourne  and  his  associates  in 
experiments  in  which  subjects  are  informed  of  the  features  that  the  rules 
include.  For  example,  a  subject  might  be  told  that  the  rule  includes  "Red" 
and  "Circle,"  but  then  would  have  to  discover  from  examples  whether  the 
combination  is  conjunction,  disjunction,  conditional,  or  biconditional. 

When  subjects  are  not  experienced  in  this  rule  learning  task,  there  are 
substantial  differences  in  the  difficulty  of  inducing  the  different  kinds 
of  rules,  and  these  correspond  to  differences  among  the  types  of  rules 
found  in  standard  concept  induction  tasks  (Haygood  &  Bourne,  1965).  The 
rule  that  is  learned  most  easily  is  the  conjunctive  rule. 

One  possible  explanation  for  differences  in  difficulty  is  that  the 
rules  differ  in  familiarity  to  the  subjects,  with  conjunction  being  the 
most  familiar  way  to  combine  features.  This  would  lead  to  a  bias  in  the 
process of forming hypotheses, with the less familiar forms of hypothesis
generated later, if at all, and consequent delays in problem solutions.
Evidence  supporting  such  an  interpretation  was  obtained  by  Bourne  (1970), 
who found that differences among the rule forms decreased when subjects
were  given  a  series  of  rule  induction  problems.  A  more  specific 
hypothesis,  proposed  by  Bourne  (1974),  is  that  with  experience,  subjects 
acquire  a  strategy  for  representing  information  about  stimuli  in  terms  of 
truth-table  values  regarding  the  features  known  to  be  relevant.  For 
example,  if  "Red"  and  "Circle"  are  the  features,  then  a  red  circle  has  the 
value T-T (true on both attributes), a green circle has the value F-T, and
so  on.  This  is  an  efficient  representation  for  solving  concept-induction 
problems,  because  each  of  the  alternative  rule  forms  corresponds  to  a 
distinctive  subset  of  truth-table  values.  A  conjunctive  rule  is  satisfied 
only  by  T-T;  a  disjunctive  rule  is  satisfied  by  T-F,  F-T,  and  T-T;  a 
conditional rule is satisfied by T-T, F-T, and F-F; and a biconditional
rule  is  satisfied  by  T-T  and  F-F.  The  truth-table  hypothesis  is  supported 
by  a  finding  by  Dodd,  Kinsman,  Klipp  and  Bourne  (1971)  that  training  on  a 
task  of  sorting  stimuli  into  the  four  categories  of  the  truth  table 
facilitated  subsequent  performance  on  rule  induction  problems. 
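
The truth-table representation lends itself to a short sketch. The code below encodes the four rule forms as the subsets of truth-table values listed above and eliminates rule forms that disagree with classified examples. The function names and the example stimuli are assumptions made for illustration; this is not a model of how subjects actually proceed.

```python
# Truth-table cells accepted by each rule form, as listed in the text.
# A cell is (first feature present?, second feature present?).
RULES = {
    "conjunctive":   {(True, True)},
    "disjunctive":   {(True, False), (False, True), (True, True)},
    "conditional":   {(True, True), (False, True), (False, False)},
    "biconditional": {(True, True), (False, False)},
}


def cell(stimulus, features):
    """Code a stimulus as a truth-table cell for the two relevant features."""
    (attr1, val1), (attr2, val2) = features
    return (stimulus[attr1] == val1, stimulus[attr2] == val2)


def consistent_rules(examples, features):
    """Return the rule forms that agree with every (stimulus, is_positive) pair."""
    surviving = set(RULES)
    for stimulus, is_positive in examples:
        c = cell(stimulus, features)
        surviving = {r for r in surviving if (c in RULES[r]) == is_positive}
    return surviving


features = (("color", "red"), ("shape", "circle"))
examples = [
    ({"color": "red", "shape": "circle"}, True),     # T-T, positive
    ({"color": "green", "shape": "circle"}, True),   # F-T, positive
    ({"color": "green", "shape": "square"}, False),  # F-F, negative
]
result = consistent_rules(examples, features)
print(result)
```

With these three examples only the disjunctive rule survives: the positive F-T instance rules out conjunction and the biconditional, and the negative F-F instance rules out the conditional.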

IV.A.2. Single-Feature Concepts. We now consider induction of
conceptual  rules  consisting  of  single  features,  such  as  "all  the  red 
pictures,"  or  "the  circles."  The  task  of  inducing  such  a  concept  is 
simpler,  of  course,  than  inducing  a  multifeature  concept. 

Evidence  for  Top-Down  Induction.  Single-feature  concept  induction  has 
been studied extensively by H. H. Kendler, T. S. Kendler, and their
associates.  One  question  addressed  in  their  experiments  is  whether 
concepts  are  acquired  in  the  form  of  a  verbalized  rule  or  in  the  form  of  an 
aggregation  of  individual  stimulus-response  connections.  It  is  likely  that 
a verbalized rule would result from a top-down hypothesis-testing process
of induction, and an aggregation of stimulus-response connections would
probably result from a bottom-up process.

Evidence  has  been  obtained  in  experiments  in  which  the  conceptual 
category  is  changed  without  informing  the  subject.  A  subject  is  given  an 
initial  concept-induction  problem  involving  a  single  stimulus  feature 
(e.g.,  "respond  positively  to  red  stimuli").  After  the  subject  meets  a 
criterion  of  correct  responding  the  rule  is  changed,  either  changing  the 
positive  value  of  the  same  attribute  (e.g.,  from  red  to  green),  called  a 
reversal  shift,  or  changing  to  a  different  attribute  (e.g.,  from  red  color 
to  large  size),  a  nonreversal  shift.  It  was  found  that  adult  human 
subjects,  and  kindergarten  children  who  solved  the  initial  problem  quickly, 
adjusted  more  easily  to  the  reversal  than  to  the  nonreversal  shift  (Buss, 
1953;  Kendler  &  D'Amato,  1955;  Kendler  &  Kendler,  1959),  while  rats  and 
slower-learning  kindergarten  children  adjusted  more  quickly  to  the 
nonreversal  shift  (Kelleher,  1956).  An  interpretation  is  that  adults  and 
school-aged children use a hypothesis such as "it depends on color," which
does not have to be changed to adjust to the reversal shift, while nonhuman
subjects  and  preschool  children  learn  specific  stimulus-response 
associations,  for  which  the  reversal  shift  requires  a  greater  change.  In  a 
later  study,  Erickson  (1971)  found  that  college-student  subjects  adjusted 
more  rapidly  to  nonreversal  shifts  if  they  had  been  carefully  instructed 
about  the  nature  of  the  concept  induction  task,  suggesting  that  when 
subjects  have  more  complete  information  about  the  task,  they  tend  to  remove 
stimulus attributes from consideration when their hypotheses are
disconfirmed.

Further evidence that adult human performance in concept induction is
based on definite hypotheses was obtained by Levine (1963), who showed that
on  a  series  of  test  trials  with  no  feedback  given,  nearly  all  of  the 
sequences  of  responses  given  by  college-student  subjects  were  consistent 
with  a  systematic  hypothesis  about  the  conceptual  rule. 

Processes of Sampling Hypotheses. The processes of forming and
evaluating  hypotheses  in  single-feature  concept  induction  are  quite 
straightforward.  Any  stimulus  feature  that  is  noticed  can  be  the  basis  of 
a  rule,  and  a  rule  that  links  a  feature  with  a  response  is  confirmed  or 
negated  directly  by  information  about  the  category  of  any  example.  Because 
the  hypotheses  are  simple,  and  there  are  many  possible  hypotheses,  it  is 
efficient  for  subjects  to  consider  samples  of  hypotheses  rather  than  a 
single  hypothesis  at  a  time.  A  sample  of  a  few  hypotheses  is  considered, 
and  on  each  trial  the  subject  can  eliminate  hypotheses  that  are 
inconsistent  with  the  information  given  about  that  trial's  stimulus.  If 
the  sample  includes  the  correct  hypothesis,  the  process  of  elimination  can 
narrow  the  sample  down  to  that  hypothesis,  which  solves  the  problem.  If 
the  sample  does  not  include  the  correct  hypothesis,  eventually  all  the 
hypotheses  in  the  sample  will  be  eliminated  and  the  subject  will  have  to 
generate  another  sample.  Note  that  this  method  is  similar  to  the 
strategies that Bruner et al. (1956) called scanning. Like the scanning
strategies, the strategy of testing samples of hypotheses is demanding on
memory.
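
The sampling-and-elimination process described above can be sketched as follows. This is a simplified illustration under stated assumptions: hypotheses are (attribute, value) pairs, samples are drawn in a fixed order from a pool rather than at random, and a newly drawn sample is not checked against earlier trials. The function names and stimuli are hypothetical.

```python
def eliminate(sample, stimulus, is_positive):
    """Keep the hypotheses (attribute, value) consistent with one instance."""
    return [(a, v) for (a, v) in sample if (stimulus[a] == v) == is_positive]


def solve(hypothesis_pool, sample_size, trials):
    """Test successive samples; return (remaining hypotheses, samples drawn)."""
    pool = list(hypothesis_pool)
    sample, pool = pool[:sample_size], pool[sample_size:]
    draws = 1
    for stimulus, is_positive in trials:
        sample = eliminate(sample, stimulus, is_positive)
        if not sample and pool:
            # The sample is exhausted, so another sample must be generated.
            sample, pool = pool[:sample_size], pool[sample_size:]
            draws += 1
    return sample, draws


pool = [("size", "large"), ("shape", "circle"), ("color", "red"), ("color", "green")]
trials = [
    ({"color": "red", "shape": "circle", "size": "small"}, True),
    ({"color": "red", "shape": "square", "size": "large"}, True),
    ({"color": "green", "shape": "circle", "size": "large"}, False),
]
remaining, draws = solve(pool, sample_size=2, trials=trials)
print(remaining, draws)
```

Here the first sample is emptied by the second trial, a second sample is drawn, and the third trial narrows it to the hypothesis about red. A larger sample_size reduces the number of resamplings at the cost of more hypotheses to hold in memory at once.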


Theoretical  discussions  have  included  many  proposals  about  processes 
of  choosing  hypotheses  to  consider,  eliminating  hypotheses  based  on 
stimulus  information,  and  remembering  previously  eliminated  hypotheses. 
Several  of  the  proposals  have  been  discussed  by  Gregg  and  Simon  (1967)  and 
by  Millward  and  Wickens  (1974). 

Results  obtained  by  Wickens  and  Millward  (1971)  provide  support  for  an 
assumption  that  experienced  subjects  remember  stimulus  attributes  that  they 
have  eliminated.  According  to  Wickens  and  Millward's  model,  if  the  sample 
of  hypotheses  is  exhausted,  the  attributes  of  eliminated  hypotheses  are 
stored in memory. Memory limitations apply both to the size of the sample
that  can  be  considered  and  to  the  number  of  previously  eliminated 
attributes  that  can  be  remembered.  In  Wickens  and  Millward's  experiment, 
subjects  received  extensive  training  in  concept  induction,  solving  many 
problems  with  the  same  set  of  stimuli,  with  different  attributes  used  to 
define  the  concept  in  the  successive  problems.  Performance  improved 
sharply  after  the  first  problem  or  two,  and  stabilized  within  10-20 
problems.  The  model  of  attribute-elimination  was  supported  by  statistical 
data as well as by subjects' responses to a retrospective questionnaire.
Differences in performance among the individual subjects can be explained
by assuming that they all performed in accord with the model's assumptions,
but that they differed in the sizes of the hypothesis samples that they
considered and in their capacities for remembering previously eliminated
hypotheses.


When  performance  of  inexperienced  subjects  has  been  analyzed  using 
stochastic  models,  the  results  have  revealed  a  surprisingly  simple 
structure of the problem-solving process. Restle (1962) investigated
mathematical  properties  of  a  process  in  which  a  sample  of  hypotheses  is 
considered  by  the  subject,  and  on  each  trial  a  response  is  chosen  using  one 
of  the  hypotheses.  In  Restle's  model  it  is  assumed  that  subjects' 
processing  of  information  differs  depending  on  whether  the  response  on  a 
trial  happens  to  be  correct.  After  each  correct  response,  hypotheses  that 
are  inconsistent  with  the  information  about  that  trial's  stimulus  are 
eliminated  from  the  sample.  After  an  error,  the  subject  considers  a  new 
sample  of  hypotheses.  A  simple  stochastic  process  results  if  it  is  assumed 
that  sampling  occurs  with  replacement.  If  this  assumption  is  made, 
solution  of  the  problem  is  an  all-or-none  event;  the  probability  of 
solving  the  problem  with  no  more  errors  after  taking  a  sample  is  a 
constant,  independent  of  the  number  of  trials  or  errors  that  have  occurred 
previously.  This  implication  is  counterintuitive.  If  we  assume  that  the 
subject  is  sampling  and  testing  hypotheses,  it  says  that  there  is  no 
accumulation  of  information  over  trials  that  makes  sampling  of  the  correct 
hypothesis  more  likely.  The  all-or-none  property  also  is  incompatible  with 
almost any assumption of learning stimulus-response associations that are
strengthened  gradually  over  trials,  as  well  as  the  summative  or  "composite 
photograph"  process  that  Woodworth  (1938)  discussed. 
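
The all-or-none property can be made concrete with a short calculation. The sketch below assumes only what the text states: each time a sample is taken (at the start and after every error), the probability c that the problem is then solved with no further errors is constant. It follows that the number of errors has a geometric distribution. The value of c used here is hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def p_errors(k, c):
    """P(exactly k errors before solution): k failed samples, then a success."""
    return (1 - c) ** k * c


def p_at_least(k, c):
    """P(at least k errors): the first k samples all fail."""
    return (1 - c) ** k


c = Fraction(1, 4)  # hypothetical value, for illustration only

# All-or-none: having already made k errors does not change the outlook.
memoryless = all(
    p_at_least(k + 2, c) / p_at_least(k, c) == p_at_least(2, c)
    for k in range(6)
)

expected_errors = (1 - c) / c  # mean of the geometric distribution
print(memoryless, expected_errors)
```

The conditional probability of making at least two more errors is the same after any number of earlier errors, which is the sense in which no information accumulates over trials in this model.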

The counterintuitive all-or-none property of Restle's model received
strong empirical support in experiments by Bower and Trabasso (1964).

Their  experiments  with  college-student  subjects  included  conditions  in 
which  the  categorical  rule  was  changed  before  the  subject  solved  the 
problem,  either  using  a  reversal  shift  or  a  nonreversal  shift.  The 
assumption  of  resampling  with  replacement  after  errors  predicts  that  shifts 
prior  to  solution  should  not  delay  the  solution  of  the  problem,  and  this 
surprising  result  was  obtained. 

Computer simulation models of the concept induction task, using
different  hypothesis  generating  strategies,  have  been  proposed  by  Gregg  and 
Simon  (1967).  They  showed  that  these  process  models  can  be  aggregated 
(approximately)  into  simple  stochastic  models  like  Restle's  (1962), 
providing an information-processing explanation for the simple statistical
regularities implied by the stochastic models and found in Bower and
Trabasso's (1964) data. Gregg and Simon found that a range of different
models  is  required  to  account  for  the  set  of  experiments  reported  by  Bower 
and  Trabasso.  According  to  these  models,  the  nature  of  sampling  depends 
primarily  upon  how  much  information  the  subjects  can  retain  about  the 
classification of previous instances, and about which hypotheses had
already  been  refuted  by  the  evidence.  In  general,  the  process  models  that 
fit  the  data  best  were  those  that  implied  very  severe  restrictions  on 
short-term  memory  for  previous  instances  and  their  classification.  Given 
this  restriction  on  memory,  the  models  are  consistent  with  the  all-or-none 
property  —  that  is,  the  expected  number  of  trials  to  solve  the  problem  is 
independent of the time the subject has already spent on it.

IV.A.3. Bottom-Up Induction of Concepts. In addition to inducing
categorical  concepts  in  a  top-down,  hypothesis-based  manner,  induction  also 
can  be  a  bottom-up  process,  involving  gradual  emergence  of  the  concept  from 
the  features  of  individual  stimuli.  This  idea  has  received  less  attention 
in psychological research, but it has not been totally missing from the
discussion.

Hull  (1920)  conducted  a  study  of  learning  in  which  the  materials  were 
pseudo-Chinese  ideograms  paired  with  nonsense  syllables.  The  stimuli 
paired  with  the  same  response  syllable  from  list  to  list  all  shared  a 
stimulus component, a radical that was part of each of the stimuli. Hull's
subjects  showed  positive  transfer  on  the  later  lists  in  the  experiment, 
indicating  that  they  had  induced  the  concepts  to  some  extent.  However, 
they  typically  were  not  aware  of  the  feature  or  features  that  were  shared, 
indicating  that  they  were  not  actively  testing  hypotheses  about  the 
categorical  rules.  It  seems  likely  that  the  subjects  stored  information 
about  the  individual  stimulus-response  pairs,  and  gradually  built  up 
impressions  that  included  the  shared  components. 

A  result  similar  to  Hull's  was  obtained  by  Reber  (1967),  who  studied 
induction  of  rules  for  an  artificial  language.  Reber  constructed  sequences 
of  letters  using  a  set  of  grammatical  rules:  for  example,  "Start  with  a  T 
or  a  V,"  "After  an  initial  T,  use  a  P  or  another  T,"  and  "After  a  V  that  is 
not  at  the  beginning,  use  a  ?  or  end  the  sequence."  The  sequences,  from  six 
to  eight  letters  long,  were  used  in  a  learning  task  in  which  subjects  were 
shown  the  sequences  and  had  to  recall  them.  Subjects  working  on  the 
grammatical  sequences  learned  faster  than  subjects  who  worked  on  a 
comparable  set  of  random  letter  sequences.  After  learning  a  set  of 
grammatical  sequences,  subjects  were  able  to  discriminate  between  new 
grammatical sequences and sequences that violated the grammar with more
than 75% accuracy. Even so, subjects were not aware of the rules that were
used  to  form  the  grammatical  sequences,  and  showed  little  awareness  of 
their  shared  features. 

In  recent  research  and  discussion,  Rosch  (e.g.,  1978)  has  argued 
persuasively that much of our conceptual knowledge is not organized on the
basis of definite feature structures, like those used in most experiments
on induction of categorical rules. First, Rosch, Mervis, Gray, Johnson,
and Boyes-Braem (1976) argued, with empirical support, that concepts at
different levels of generality are not equal in salience, but that there
are basic categories whose members share large numbers of features that are
not shared by members of other categories, including characteristic
patterns by which we interact with them motorically. For example, "chair,"
"table," and "hammer" refer to basic categories, while their
superordinates, "furniture" and "tool," and their subordinates, such as
"picnic table" and "claw hammer," are less fundamental. Data supporting
this distinction were obtained by Rosch et al., who gave subjects a set
of 90 terms and asked them to write all the attributes that came to mind.
Another group of subjects was given the same terms and asked to write
descriptions of muscle movements that they would make in interacting with
the objects. Many more attributes and movements were listed for
basic-level terms than for their superordinates, and very few additional
attributes beyond those for the basic terms were given for subordinate
terms.

Rosch (1973, 1975) also has argued that categories are represented as
prototypes, rather than as [...]. A prototype may be thought of as a kind
of schema for a category, which is activated more readily by typical
instances than by atypical ones. For example, in the category of birds,
robins and canaries are judged more typical than penguins or peacocks, and
in the
category  of  tools,  hammers  and  saws  are  judged  more  typical  than  anvils  or 
scissors.  Rosch  (1975)  found  that  there  is  very  strong  agreement  among 
subjects  in  ratings  of  typicality.  Evidence  that  typicality  influences 
cognitive  processes  has  been  obtained  when  subjects  are  asked  to  judge 
whether statements such as "A robin is a bird" or "An anvil is a tool" are
true.  In  these  experiments,  judgments  are  made  more  quickly  for  the 
statements  involving  more  typical  examples  (Rosch,  1973;  Rips,  Shoben  & 
Smith,  1973). 

Acquisition  of  prototypical  concepts  has  been  studied  experimentally 
by  Posner,  Goldsmith  and  Welton  (1967),  Franks  and  Bransford  (1971),  and 
Reed  (1972),  among  others.  For  these  experiments,  a  set  of  stimuli  is 
constructed  by  varying  a  single  stimulus,  the  prototype.  The  stimuli  may 
be  geometric  forms,  patterns  of  dots,  schematic  faces,  or  other  kinds  of 
stimuli.  The  stimuli  are  shown  to  subjects,  and  then  a  recognition  test  is 
given.  Subjects'  confidence  in  recognition  is  a  function  of  the  similarity 
of  stimuli  to  the  prototype.  When  the  prototype  itself  is  shown,  subjects 
respond  positively  with  strong  confidence,  even  if  the  prototype  was  not 
included  in  the  set  of  stimuli  they  saw.  Several  investigators  have  shown 
that  this  performance  can  be  explained  by  considering  the  frequencies  with 
which  various  stimulus  features  occur  during  the  learning  trials;  for 
example,  the  features  of  the  prototype  appear  with  high  frequency,  even  if 
the  prototype  itself  is  not  presented  (Reitman  &  Bower,  1973;  Neumann, 
1974). 
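
The feature-frequency explanation can be illustrated with a small sketch. It assumes that stimuli are tuples of discrete feature values and that recognition confidence grows with the summed training frequency of a stimulus's features. Both the coding and the scoring rule are simplifying assumptions made here, not the specific models of the investigators cited.

```python
from collections import Counter


def feature_frequencies(training):
    """Count how often each value occurred at each feature position."""
    return [Counter(values) for values in zip(*training)]


def familiarity(stimulus, freqs):
    """Sum the training frequencies of the stimulus's feature values."""
    return sum(freq[value] for freq, value in zip(freqs, stimulus))


prototype = (1, 1, 1, 1)
# Each training stimulus differs from the prototype in exactly one feature;
# the prototype itself is never presented.
training = [(0, 1, 1, 1), (1, 0, 1, 1), (1, 1, 0, 1), (1, 1, 1, 0)]

freqs = feature_frequencies(training)
score_prototype = familiarity(prototype, freqs)
score_old_item = familiarity(training[0], freqs)
score_distant_item = familiarity((0, 0, 1, 1), freqs)
print(score_prototype, score_old_item, score_distant_item)
```

In this example the unseen prototype receives a higher score than any stimulus that was actually shown, and scores fall off as test stimuli depart further from the prototype.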

A model that simulates bottom-up acquisition of a prototypical concept
has  been  formulated  by  Anderson,  Kline  &  Beasley  (1979),  using  general 
principles  of  learning  in  the  context  of  a  production-system  model  of 
performance.  Anderson  et  al.'s  system  stores  cognitive  representations  of 
the patterns seen in individual stimuli, and additional representations are
stored  by  processes  of  generalization  and  discrimination.  Representations 
are  strengthened  when  they  provide  a  basis  for  recognizing  stimuli  that  are 
presented.  Anderson  et  al.'s  simulation  accurately  mimics  subjects' 
performance  on  recognition  tests,  including  false  recognition  of  prototypes 
that  have  not  been  presented  during  learning. 

A  reasonable  expectation  is  that  many  learning  processes  are  not 
strictly  top-down  or  bottom-up,  but  a  combination  of  the  two.  Such  a 
combination was analyzed by Greeno and Scandura (1966) and by Polson (1972)
in  studies  of  concept  induction  involving  verbal  items.  In  the 
experimental  setup,  like  that  used  by  Hull  (1920),  lists  of  paired 
associates  are  presented  to  be  memorized,  and  in  successive  lists  the  same 
response  term  is  paired  with  different  stimuli  that  are  related  to  one 
another.  Greeno  and  Scandura  found  that  transfer  to  individual  items 
occurred  in  an  all-or-none  manner;  different  sets  of  items  had  differing 
proportions  of  items  with  no  errors,  but  for  items  with  any  errors 
performance  in  the  transfer  conditions  could  not  be  distinguished  from  each 
other  or  from  performance  on  control  items.  The  finding  of  all-or-none 
transfer suggests a top-down conceptual process in which any individual
item either is or is not recognized as a member of a definite category.
Polson (1972) studied acquisition of the conceptual categories and found
that  this  was  not  an  all-or-none  process.  The  findings  were  consistent 
with  a  hypothesis  of  a  two-stage  process.  For  some  subjects,  there  is  an 
initial stage of bottom-up learning, in which associations of responses
with patterns of features are stored, with transfer depending on features
that are shared by similar items. In the initial phase, the subject may
notice the shared features of members of a concept category by chance.
Once the shared feature of a category is recognized, the second stage of
learning occurs, involving a top-down process in which the subject
actively searches for features to use in classifying the stimuli.

It  is  likely  that  both  the  top-down  and  the  bottom-up  methods  of 
learning  about  categories  are  available  to  human  learners,  and  the  question 
arises  as  to  what  circumstances  make  it  more  likely  for  one  rather  than  the 
other  to  be  used.  Brooks  (1978)  compared  a  condition  in  which  subjects 
were  asked  to  learn  names  for  individual  stimuli  with  a  condition  in  which 
subjects  induced  a  rule  for  classifying  stimuli.  Explicit  rule  induction 
led  to  better  knowledge  of  relevant  features,  reflected  in  better 
performance  on  classification  of  new  stimuli,  as  would  be  expected  from 
learning  by  top-down  induction.  Subjects  who  learned  individual  names  gave 
superior  performance  in  recognition  of  specific  stimuli  from  the  learning 
set,  but  also  recognized  new  stimuli  at  an  above-chance  level,  as  would  be 
expected  from  bottom-up  acquisition  of  a  concept  involving  a  summation  of 
instances.

IV.B. Sequential Concepts.

We  now  discuss  two  tasks  involving  induction  of  concepts  that  are  more 
complex than those discussed in Section IV.A. In the tasks we discuss now,
the materials are sequences of elements that are organized in patterns. The
subject's  task  is  to  induce  the  patterns.  First,  we  discuss  the  task  of 
extrapolating  sequences  of  letters,  where  a  subject  must  identify  patterns 
in  the  sequences  that  are  presented  and  use  the  patterns  to  extend  the 
sequences.  The  second  task  is  induction  of  grammatical  rules  of  a  language 
from  example  sentences  that  are  consistent  with  the  grammar. 

In  these  tasks,  the  problem  space  includes  a  set  of  stimuli  and  a 
space  of  possible  structures,  as  in  all  induction  problems.  However,  in 
comparison  to  the  space  of  possible  rules  for  classifying  stimuli,  the 
space  of  possible  pattern  descriptions  for  sequences  and  the  space  of 
possible  grammatical  rules  are  extremely  large.  To  solve  these  problems, 
substantial reductions of the search spaces are required, and these are
accomplished  by  constraints  on  the  generation  of  hypotheses.  In  sequence 
extrapolation,  a  limited  set  of  relations  and  sequence  forms  are 
considered.  In  the  analysis  of  grammar  induction  that  we  discuss, 
hypotheses  about  the  structures  of  sentences  are  constrained  by  the 
structures  of  situations  that  the  sentences  describe. 

IV.B.1. Sequence Extrapolation. An example of a sequence extrapolation
problem is the following: mabmbcmcdm _, where the task is to extend the
sequence. In a model of sequence extrapolation formulated by
Simon  and  Kotovsky  (1963),  a  pattern  is  induced  from  basic  relations 
between  the  letters  in  the  problem  string.  The  pattern  is  a  kind  of 
formula  for  producing  the  sequence;  once  discovered,  the  formula  can  be 
used to extend the sequence, as required.

For example, for the problem mabmbcmcdm _, the formula that is induced is
the following: [s1:m; s2:a], [s1, s2, (N(s2)), s2]. The first part of the
formula is initialization. There are two subsequences, denoted s1 and s2;
s1 starts with m, and s2 starts with a. The second part of the formula
gives instructions for producing the sequence. The instructions are
interpreted as follows: s1: write the current symbol of s1; s2: write the
current symbol of s2; (N(s2)): change the symbol in s2 to the successor (N
for next) of the current symbol; finally, s2: write the (new) current
symbol of s2. The entire sequence is generated by repeating this routine
as many times as necessary.
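
The way such a formula produces a sequence can be sketched as a small interpreter. The code below is an illustration, not Simon and Kotovsky's program: the data layout (a dict for the initialization and a list of steps for the instructions) and the wrap-around from z to a are assumptions made here. The formula itself is the one given above, with subsequences s1 and s2.

```python
import string

ALPHABET = string.ascii_lowercase


def generate(init, body, length):
    """Produce `length` letters from a pattern description.

    init: dict giving each subsequence's starting letter,
          e.g. {"s1": "m", "s2": "a"}
    body: list of steps; a name such as "s1" writes the current letter of
          that subsequence, and ("N", "s2") advances s2 to its successor
          without writing anything.
    """
    current = dict(init)
    out = []
    while len(out) < length:
        for step in body:
            if isinstance(step, tuple):
                # ("N", name): replace the current letter by its successor.
                _, name = step
                i = ALPHABET.index(current[name])
                current[name] = ALPHABET[(i + 1) % len(ALPHABET)]
            else:
                # name: write the current letter of that subsequence.
                out.append(current[step])
                if len(out) == length:
                    break
    return "".join(out)


formula_init = {"s1": "m", "s2": "a"}
formula_body = ["s1", "s2", ("N", "s2"), "s2"]

given = generate(formula_init, formula_body, 10)
extended = generate(formula_init, formula_body, 12)
print(given, extended)
```

Running the routine for ten letters reproduces the given string, and running it for two more letters yields the continuation that the task requires.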

The  problem  solver  constructs  a  formula  as  a  hypothesis,  based  on  the 
first  letters  of  the  given  sequence,  and  tests  the  hypothesis  with  more  of 
the  letters.  There  are  many  different  ways  to  form  a  sequence  of  letters, 
so  in  principle,  the  number  of  possible  formulas  is  extremely  large.  To 
make  the  task  manageable,  some  constraints  have  to  be  imposed.  In  Simon 
and  Kotovsky's  (1963)  model,  constraints  are  imposed  on  the  generation  of 
hypotheses.  As  in  the  focussing  strategies  that  Bruner  et  al.  (1956) 
identified  for  concept  induction,  hypotheses  about  the  structure  of  a 
pattern  are  based  on  features  of  the  stimulus,  rather  than  being  generated 
a  priori.  Furthermore,  only  a  limited  set  of  the  possible  hypotheses  are 
ever  generated,  because  the  model  only  considers  a  small  set  of  relations 
between  elements  and  it  is  assumed  that  the  sequence  fits  a  specific  form. 

The model knows the alphabet of letters, both forward and backward. The
relations that are recognized are identity and successor, I and N. The
problem solver assumes that the sequence is periodic, an important
structural characteristic.

The model begins by determining the period of the sequence. Periodicity
can be discovered either by noting that a relation is repeated every nth
symbol, or by noting that a relation is interrupted at every nth position.
In the problem mabmbcmcdm _, the periodicity is identified by noting that
the relation I occurs between every third symbol, the m's. Then the
problem solver produces a description of the symbols that occur within the
periods and the relations between corresponding symbols in successive
periods. For mabmbcmcdm _, the description requires two subsequences, one
of which is just repetition of m; the other starts with a and is
incremented to produce the final term within the set of three symbols. The
result of the process is a formula for producing the sequence, such as the
one described earlier for the example problem.
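
The first step, finding the period, can be sketched in a few lines. This is a deliberately simplified version of the idea and not the model itself: it looks only for a relation (I or N) that holds between corresponding letters of successive periods, it ignores relations within a period, and it assumes lowercase letters. The function names are hypothetical.

```python
import string

ALPHABET = string.ascii_lowercase


def relation(a, b):
    """Return 'I' (identity), 'N' (successor), or None for two letters."""
    if a == b:
        return "I"
    if ALPHABET.index(b) - ALPHABET.index(a) == 1:
        return "N"
    return None


def find_period(seq):
    """Return (p, relations) for the smallest period p at which every
    position within the period shows a single relation between successive
    periods, or None if no such period is found."""
    for p in range(1, len(seq) // 2 + 1):
        relations = []
        for offset in range(p):
            letters = seq[offset::p]
            found = {relation(a, b) for a, b in zip(letters, letters[1:])}
            if len(found) != 1 or None in found:
                break  # this position has no single consistent relation
            relations.append(found.pop())
        else:
            return p, relations
    return None


result = find_period("mabmbcmcdm")
print(result)
```

For the example problem the sketch finds a period of three, with identity at the first position (the m's) and successor at the other two positions.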

Because  the  product  of  the  inductive  process  is  an  explicit  formula, 
sequence  extrapolation  can  be  considered  as  a  problem  of  design  as  well  as 
a  problem  of  induction.  Viewed  in  this  way,  the  problem  solver  has 
available a set of symbols (s1, s2, s3, perhaps more, N, and the letters
of the alphabet) and has the task of constructing a formula from these
symbols. The feature of sequence extrapolation that makes it an inductive
task  is  the  criterion  that  the  construction  must  satisfy.  The  criterion  is 
that  the  formula  should  produce  the  sequence  of  letters  that  is  given  in 
the problem. In ordinary problems of design, such as anagram or
cryptarithmetic, the criterion is a general property rather than agreement
with an arrangement of stimuli.

Simon and Kotovsky (1963) reported data on difficulty of solving 15
different  sequence-extrapolation  problems  by  two  groups  of  human  subjects 
and  found  fair  agreement  between  the  relative  difficulty  of  problems  for 
human  solvers  and  for  their  program.  A  more  thorough  empirical  study  was 
conducted  by  Kotovsky  and  Simon  (1973),  who  collected  thinking-aloud 
protocols on problems with sequences presented under panels that
subjects lifted to see individual letters. The data were consistent with
the model in important respects. Subjects, like the model, determined the
periodicity of sequences and looked for relations between successive
elements or between elements separated by a regular period.
Representations of sequences induced by the subjects agreed with those that
the model induced in a majority of cases.

There  also  were  discrepancies,  some  of  which  involved  relatively  minor 
details  of  programming,  but  two  of  which  revealed  significant  processes  in 
humans  not  represented  in  the  model.  First,  there  was  a  closer  integration 
in  the  subjects'  performance  than  in  the  program's  between  discovery  of  the 
period  of  the  sequence  and  induction  of  the  pattern  description.  These  are 
distinct  phases  in  the  model,  whereas  the  human  problem  solvers  used 
information in forming the pattern description that they had picked up
during the phase of finding the period. Another discrepancy between human
data and Simon and Kotovsky's simplest model was that in some problems,
human  solvers  induced  patterns  with  hierarchical  structure,  involving  a 
single  low-level  description  and  a  higher-level  switch  that  transited 
between  versions  of  the  low-level  structure.  Hierarchical  relation  between 
levels  of  pattern  description  is  a  basic  structural  feature  of  sequential 
patterns  that  can  play  a  dominant  role  in  the  induction  process,  as  shown 
by  Restle  (1970). 

IV. B. 2.  Grammatical  Rules.  Next,  we  consider  induction  of  the 
grammar  of  a  language.  We  discuss  aspects  of  language  acquisition  that 
relate  directly  to  general  issues  in  the  theory  of  problem  solving.  Thus, 
our  discussion  is  selective,  and  does  not  fully  represent  the  rich 
literature on processes of language acquisition, which deals with a much
broader  range  of  issues  than  we  consider  here. 

In  acquiring  the  grammar  of  a  language,  the  materials  presented  to  a 
learner  include  sentences  of  the  language.  The  task  is  to  infer  a  set  of 
rules  that  can  be  used  to  parse  sentences  that  are  heard  and  produce 
sentences  that  are  grammatical  in  the  language.  Thus,  problem  solving 
involves search in a space of possible syntactic rules. The space of
stimuli  includes  the  grammatical  sentences  that  the  learner  hears,  and  the 
task  is  to  induce  the  rules  that  characterize  the  structure  of  those 
sentences. 

Human knowledge of the rules of grammar is implicit, in contrast to
the explicit formulas that are induced in the sequence extrapolation task.
This is seen in the facts that very young children have significant
knowledge of grammar (e.g., Brown, 1973), and that adults know grammatical
rules explicitly only if they have had special training. Because of the
implicit nature of grammatical knowledge, the product of language learning
is characterized as a set of procedures, rather than explicit formulas or
other descriptions of structure. The procedures acquired by learners of a
language enable them to produce and understand sentences that agree with
the grammar of the language and to distinguish between grammatical and
ungrammatical sequences of words. We refer to such a set of procedures as
knowledge  of  the  grammatical  rules,  because  the  rules  are  built  into  the 
procedures.  As  with  much  procedural  knowledge,  an  individual's  knowledge 
of  the  rules  in  the  form  of  procedures  does  not  imply  that  he  or  she  can 
state  what  the  rules  are. 

There  is  evidence  from  both  empirical  studies  (e.g.,  Moeser  &  Bregman, 
1972) and theoretical analyses (e.g., Wexler & Culicover, 1980) that
grammatical  rules  are  learned  more  easily  if  reference  is  provided  for 
terms  in  the  language.  This  indicates  that  in  the  space  of  stimuli  for 
inducing  a  grammar,  each  sentence  is  paired  with  a  situation  that  the 
sentence  describes.  The  functions  of  situations  in  facilitating  induction 
of  grammatical  rules  probably  include  assisting  in  determining  which  words 
belong  together  in  constituent  units  (cf.  Morgan  &  Newport,  1981). 

We  discuss  an  analysis  of  language  acquisition  by  Anderson  (1975, 
1977),  as  an  example  that  describes  a  definite  information-processing 
mechanism  for  acquiring  knowledge  of  grammatical  rules  in  the  form  of 
procedures.  Anderson's  system  includes  learning  processes  that  show  how 
semantic  reference  can  facilitate  the  acquisition  of  grammar.  Anderson's 
learning  system,  called  LAS  for  Language  Acquisition  System,  induces  rules 
of  grammar  when  it  is  given  sentences  in  a  language  accompanied  by  the 
semantic  objects  that  the  sentences  are  about.  For  example,  if  the 
sentence,  "The  red  square  is  above  the  small  circle"  is  presented  to  LAS, 
there  also  is  a  semantic  network  that  represents  an  object  with  the 
properties red and square, another object with the properties small and
circle,  and  the  relation  above  between  the  two  objects. 

LAS  has  a  procedure,  used  in  its  learning  of  grammar,  that  identifies 
the  objects  in  the  semantic  network  that  correspond  to  words  in  the 
sentence  and  forms  a  structure  showing  the  relations  among  those  concepts. 
This structure is used to determine constituent units of the sentence. In
the example sentence, red and square are bracketed together, because they
are properties of the same object, as are small and circle. The relational
term above is at a higher level in the bracketing that LAS forms. The
procedures that LAS acquires include rules for parsing noun phrases such as
the red square and the small circle, and sentences of the form NP Relation
NP.  LAS  also  has  a  mechanism  for  generalization,  so  that  similar 
structures  eventually  come  to  be  parsed  by  LAS  with  a  single  rule,  and  some 
of  these  generalizations  produce  recursive  parsing  rules.  The 
generalization  process  sometimes  produces  incorrect  rules  that  are  too 
general,  and  LAS  also  includes  a  mechanism  of  discrimination  that  restricts 
the  application  of  its  language-processing  procedures. 
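
The way semantic reference can determine constituent units may be
illustrated with a small Python sketch. It is not Anderson's LAS: the
function name, the dictionary standing in for the semantic network, and
the rule that a word with no referent (such as "the" or "is") attaches to
the next word that has one are assumptions made only for illustration.

```python
def bracket_by_reference(sentence, objects, relation_word):
    """Group the words of a sentence into constituent units.

    objects maps a (hypothetical) object identifier to the set of property
    words true of that object; relation_word names the relation that holds
    between the objects. Adjacent words that refer to the same object are
    bracketed together.
    """
    referent = {word: obj for obj, props in objects.items() for word in props}
    referent[relation_word] = "RELATION"

    units = []    # list of (referent, [words]) in sentence order
    pending = []  # words with no referent, waiting for the next content word
    for word in sentence.lower().split():
        ref = referent.get(word)
        if ref is None:
            pending.append(word)
            continue
        if units and units[-1][0] == ref and not pending:
            units[-1][1].append(word)  # same object as the previous word
        else:
            units.append((ref, pending + [word]))
        pending = []
    if pending and units:
        units[-1][1].extend(pending)  # trailing words join the last unit
    return [words for _, words in units]


network = {"object1": {"red", "square"}, "object2": {"small", "circle"}}
units = bracket_by_reference(
    "The red square is above the small circle", network, "above"
)
```

With the example sentence, the sketch yields three units: the red square,
is above, and the small circle, which corresponds to the form NP Relation
NP.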

Viewed  as  a  problem-solving  system,  LAS  conducts  search  in  a  space  of 
procedures  for  producing  and  understanding  sentences.  (Note  that  we  can 
also  view  LAS  as  designing  or  constructing  these  procedures.)  LAS's  use  of 
the  structure  of  situations  provides  significant  constraints  that  are 
needed  for  the  search.  As  in  Simon  and  Kotovsky's  (1963)  model  of  sequence 
extrapolation,  the  constraints  are  applied  to  the  generation  of  hypotheses. 
Processes  for  modifying  the  induced  procedures  are  available;  the  system 
can generalize its procedures, which makes its performance more efficient,
and  it  can  add  restrictions  to  the  application  of  procedures  when  it  is 
informed  that  use  of  a  procedure  has  produced  an  error. 


IV. C.  Nonsequential  Patterns 

Now  we  discuss  induction  of  patterns  that  are  not  sequential  in 
character.  We  begin  with  a  simple  case.  We  discuss  analogy  problems  in 
which  one  or  two  pairs  of  items  are  presented,  related  in  some  way.  The 
task  is  to  form  another  pair  with  the  same  relation.  There  have  been 
extensive  empirical  and  theoretical  analyses  of  processes  of  solving 
analogy  problems.  We  then  discuss  more  complicated  cases,  involving 
induction  of  concepts  in  mathematics  and  of  quantitative  regularities  and 
structures  in  scientific  domains,  for  which  the  available  analyses  are 
primarily  theoretical. 

IV. C. 1. Analogy Problems. The form of an analogy problem is
A:B::C:D,  where  D  is  often  a  set  of  alternative  items  that  can  complete  the 
analogy,  with  the  subject  required  to  choose  one  from  the  set.  A  and  B  are 
related  in  some  way,  and  the  correct  choice  is  a  D  item  with  the  same 
relation  to  C  as  B  has  to  A.  Solution  of  an  analogy  problem  involves 
search  in  a  space  of  relations  for  one  that  can  be  applied  to  both  the  A:B 
and the C:D pairs, or to one of the C:Di alternatives more successfully
than  any  of  the  others.  Analogy  problems  are  used  commonly  in  tests  of 
intellectual  ability.  In  factor-analytic  studies,  analogy  problems 
contribute  most  to  the  factor  of  induction,  the  single  best  predictor  of 
academic  achievement  (Snow,  1980). 

Solution  of  analogy  problems  requires  (1)  a  process  for  recognizing  or 
analyzing  relations  between  pairs  of  stimuli,  that  is,  between  the  A  and  B 
stimuli  and  between  C  and  each  of  the  Di  alternatives;  and  (2)  a  process 
that  compares  relations  found  for  the  A:B  pair  with  relations  found  for  the 
various  C:Di  alternatives  and  chooses  the  C:Di  relation  that  matches  best 
with an A:B relation. In the simplest case, the relation for A and B that
comes to mind first also applies to one and only one of the C:Di pairs.
When this does not occur, because relations found for A:B apply either
to more than one C:Di pair or to none of them, some further analysis of the
A:B pair is required. In such cases, A:B relations can be suggested by
relations that are found in considering the C:Di pairs.

We  discuss  two  processes  for  solving  analogy  problems.  In  one 
process,  relations  between  pairs  of  items  are  based  on  information  stored 
in  the  problem-solver's  memory.  Memory-based  analogy  problems  include  most 
verbal  analogies,  where  solutions  use  relations  between  words  that  are 
stored  in  memory  or  are  inferred  from  word  meanings.  In  the  other  process, 
relations  are  determined  by  analysis  of  features  of  stimuli.  When  analogy 
problems  are  composed  of  geometric  diagrams,  relations  between  pairs  of 
terms  are  found  by  comparing  pairs  of  diagrams  and  identifying  differences 
between  the  members  of  each  pair. 

Relations  Based  on  Semantic  Memory.  Solutions  to  many  verbal 
analogies  are  obtained  by  finding  a  relation  between  the  A  and  B  words 
based  on  their  meanings  stored  in  semantic  memory,  and  then  finding  a 
similar relation between C and one of the Di alternatives. Reitman (1965)
formulated a model for verbal analogies based on activation of concepts in
a semantic network. Reitman's model, called Argus, solves problems such as
bear:pig::chair:(foot, table, coffee, strawberry). Argus has
knowledge of words in a network of relational connections; for example,
bear and pig are both connected to animal through the relation
superordinate. Activation and inhibition are transmitted through
connections between units.

Argus  can  perform  according  to  different  strategies.  In  one  strategy, 
the A and B terms are activated, and relations that become active are
noted; then C is activated, and the Di alternatives are activated in turn. A
goal  is  set  for  relations  that  are  the  same  as  the  ones  activated  by  the 
A:B  pair.  When  a  C:Di  pair  activates  those  relations,  that  Di  alternative 
is  chosen.  In  the  example  problem,  after  bear  and  pig  are  activated,  their 
superordinate  relations  to  animal  become  active,  because  these  lie  on  a 
path  between  the  activated  terms.  Then  chair  is  activated  along  with  the 
Di  alternatives  in  turn,  with  the  goal  of  finding  active  superordinate 
relations.  This  goal  is  achieved  when  table  is  activated,  because  both 
chair  and  table  are  connected  by  superordinate  relations  to  furniture. 
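
The strategy just described can be sketched in Python under strong
simplifications. This is not Reitman's Argus: graded activation and
inhibition are replaced here by a simple check for a shared node, and the
network entries for foot, coffee, and strawberry are hypothetical, added
only so that the example problem can be run.

```python
# Each word maps relation names to the node reached through that relation.
# Only the bear, pig, chair, and table entries follow the text; the rest
# are assumptions for illustration.
NETWORK = {
    "bear": {"superordinate": "animal"},
    "pig": {"superordinate": "animal"},
    "chair": {"superordinate": "furniture"},
    "table": {"superordinate": "furniture"},
    "foot": {"superordinate": "body part"},
    "coffee": {"superordinate": "beverage"},
    "strawberry": {"superordinate": "fruit"},
}


def shared_relations(x, y):
    """Relations through which x and y both reach the same node."""
    links_x = NETWORK.get(x, {})
    links_y = NETWORK.get(y, {})
    return {rel for rel, node in links_x.items() if links_y.get(rel) == node}


def solve_analogy(a, b, c, alternatives):
    """Note the relations joining A and B, set them as the goal, and return
    the first alternative whose pairing with C satisfies that goal."""
    goal = shared_relations(a, b)
    if not goal:
        return None  # no A:B relation found; further analysis would be needed
    for d in alternatives:
        if goal <= shared_relations(c, d):
            return d
    return None


answer = solve_analogy("bear", "pig", "chair",
                       ["foot", "table", "coffee", "strawberry"])
```

In this sketch the goal set contains only the superordinate relation, and
table is the first alternative that shares a superordinate with chair.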

Strategic  factors  in  analogy  problems  were  demonstrated  in  an 
experiment by Grudin (1980). Grudin presented two kinds of analogy items:
one, called standard items, where a salient relationship between A and B
can be matched with one of the C:Di pairs; and the other, called
nonstandard items, where there is no salient relation between A and B, but
a relation between A and C matches one between B and a Di alternative. An
example is the item bird:air::fish:(breathe, water, swim) in standard form,
which in its nonstandard version is bird:fish::air:(breathe, water, swim).
The  nonstandard  problems  are  more  difficult,  as  measured  by  the  time 
required  for  a  solution.  However,  if  subjects  can  adapt  their  strategies 
to  look  for  relations  between  A:C  and  B:Di  pairs,  the  difficulty  of 
nonstandard  problems  might  be  reduced.  Grudin's  sequence  of  problems 
included  five-item  sets  that  were  either  all  standard  or  all  nonstandard, 
followed  by  either  a  standard  or  a  nonstandard  problem.  During  solution  of 
a  set  of  nonstandard  items,  a  shift  in  strategy  could  occur,  involving  more 
attention  to  the  A:C  and  B:Di  pairs.  This  would  produce  shorter  times  for 
nonstandard  problems  following  nonstandard  sets  than  for  nonstandard 
problems  following  standard  sets,  and  this  result  was  obtained. 

Thinking-aloud  protocols  in  solution  of  verbal  analogies  were  obtained 
in  a  study  by  Heller  (1979;  also  described  by  Pellegrino  &  Glaser,  1982). 
Heller  first  presented  the  three  terms  of  an  analogy  stem  and  asked  the 
subject  to  think  aloud,  including  a  statement  of  any  A:B  relations  and 
expectations  about  the  answer  that  came  to  mind.  Then  four  alternative 
answers  were  presented  individually  with  the  subject  asked  to  judge  whether 
each  alternative  was  an  acceptable  answer,  and  why,  and  finally  the 
complete  problem  was  presented  for  a  final  choice. 

Heller's  findings  were  consistent  with  the  general  features  of 
Reitman's  (1965)  hypotheses  of  solution  strategies  and  of  finding  relations 
by  activation  of  a  semantic  network.  Strategic  factors  provide  an 
interpretation  of  individual  differences  in  performance,  and  the  activation 
hypothesis  is  supported  by  a  finding  of  variability  in  solution  sequences. 

Heller's major finding was a striking difference between groups of
subjects in their adherence to the task constraints of analogy problems.
The main constraint of an analogy is that the A:B and C:Di relationships
should correspond. If a subject chooses a Di response on the basis of
a relation to C without regard to the correspondence of that relation to
the A:B relation, then the analogy constraint has not been applied.
Subjects who had good overall performance mentioned the similarity or
difference between an A:B relation and at least one of the C:Di relations
on nearly all problems. In contrast, subjects with poorer overall
performance were inconsistent in applying the constraint of a matching A:B
and C:Di relation, frequently accepting answers based on a quite diffuse
relation between Di and C or other terms in the analogy. To
account for the differences among subjects in adherence to the task
constraints, Heller proposed that individuals differ in the strengths of
goals that are related to general solution strategies. In Reitman's model,
this would correspond to the better subjects' having strategic goals that
were activated more strongly, or to differences in the degree to which
goals became inactive or were interfered with by other processes.

Heller's  protocols  also  revealed  considerable  variability  in  the 
sequence  of  steps  in  solving  the  problems.  In  a  majority  of  cases, 
subjects  identified  an  A:B  relation  and  then  thought  about  C:Di 
alternatives  in  the  context  of  that  relation.  There  also  were  cases  in 
which  a  relation  between  A  and  B  came  to  mind  as  a  subject  thought  about 
one  or  more  of  the  C:Di  relations.  Such  solution  sequences  occurred  in 
about  20%  of  the  problems  for  which  subjects  adhered  to  the  analogical 
constraints.  Reitman's  assumption  that  relations  are  found  by  activation 
of  a  semantic  network  provides  an  interpretation  of  the  variability  of 
solution sequences, since activation of a relation in the context of a C:Di
pair  would  facilitate  its  recognition  for  A:B  in  some  cases  where  A:B  did 
not  elicit  it. 

Further  information  relevant  to  individual  differences  was  obtained  in 
a  study  by  Pellegrino  and  Glaser  (1982).  Analogy  items  with  single  D 
alternatives  were  presented  and  subjects  judged  the  items  as  true  or  false. 
Pellegrino  and  Glaser  used  an  experimental  and  statistical  method 
introduced  by  Sternberg  (1977),  in  which  the  four  terms  are  presented  in 
sequence, with the subject making a response to request the addition of
each term. The latencies of these responses are used to estimate the time
for various components of the solution process, according to a general
model.  Each  latency  includes  time  to  encode  the  new  item.  When  B  is 
presented,  the  latency  includes  time  to  infer  one  or  more  relations  between 
A  and  B.  When  C  is  presented,  the  latency  includes  time  to  map  A:B 
relations  onto  the  C  term.  When  D  is  presented,  C:D  relations  are  inferred 
and  compared  with  the  A:B  relations.  It  was  assumed  that  the  comparison 
process  could  have  three  outcomes.  The  relations  could  correspond  well, 
leading  to  a  "true"  response.  The  lack  of  correspondence  could  be  so  great 
that  the  subject  would  immediately  reject  the  analogy  and  give  a  "false" 
response.  The  subject  could  judge  that  the  correspondence  was 
indeterminate  and  engage  in  a  more  extended  analysis,  possibly  including 
review  of  the  A  and  B  terms  to  find  new  relations. 

Four  sets  of  items  were  used  in  the  study.  There  were  positive  items, 
which  were  judged  to  be  appropriate  analogies,  and  negative  items,  which 
were  judged  to  be  inappropriate.  Within  each  of  these  sets,  there  were 
items  in  which  the  C  and  D  terms  were  strong  associates  and  other  items  in 
which  the  C  and  D  terms  were  not  associates.  A  weak  C-D  association  for  a 
positive  item,  or  a  strong  C-D  association  for  a  negative  item,  was 
expected  to  make  the  item  more  ambiguous  and  increase  the  frequency  of 
extended  analyses  in  the  final  component  of  the  solution  process.  The 
results  supported  this  expectation;  estimates  of  the  proportion  of 
problems with extended analyses were higher for weakly associated positive
items  (.55)  than  for  strongly  associated  positive  items  (.23),  and  for 
strongly  associated  negative  items  (.19)  than  for  weakly  associated 
negative  items  (.07).  A  similar  correlation  of  item  difficulty  with  time 
spent  in  the  final  stage  of  solution  was  obtained  by  Barnes  and  Whitely 
(1981). 

Pellegrino  and  Glaser's  major  finding  was  that  the  frequencies  of  an 
extended  analysis  were  correlated  with  the  subjects'  overall  ability  in  the 
analogies  task.  The  subjects  were  college  students  divided  into  groups 
with  relatively  high  and  relatively  low  scores  on  a  standard  analogies 
test.  The  estimates  of  time  for  the  various  information-processing 
components  were  generally  longer  for  the  low-ability  subjects.  But  the 
most  striking  difference  was  in  the  frequency  of  engaging  in  an  extended 
analysis,  which  was  more  than  twice  as  high  for  the  low-ability  than  for 
the  high-ability  subjects.  Pellegrino  and  Glaser  concluded  that  the 
low-ability  subjects  more  frequently  arrived  at  the  final  stage  of 
processing  an  analogy  with  an  inadequate  representation  of  the  relations 
among  the  other  terms,  and  therefore  had  to  reconsider  the  A,  B,  and  C 
terms  more  frequently.  (A  similar  difference  in  the  solution  process  was 
found  by  Snow,  1980,  in  spatial  reasoning  tasks  in  which  the  items  are 
diagrams  and  reexaminations  of  terms  can  be  observed  by  recording  eye 
movements.)  In  verbal  analogies,  this  difference  in  processing  could  be  due 
to  differences  in  the  information  in  semantic  memory,  differences  in  the 
activation  process,  or  differences  in  strategy  with  low-ability  subjects 
more  likely  to  want  to  see  the  final  term  to  facilitate  recognition  of  A:B 
relations.  This  conclusion  is  consistent  with  Heller's  finding  that 
students  with  lower  ability  in  analogies  often  choose  responses  that 
violate  the  constraint  of  an  analogy  problem.  They  frequently  lack  a 
response  that  satisfies  the  constraints,  and  are  likely  to  choose  a 
response  on  some  other  basis. 

In  Reitman's  (1965)  model  of  verbal  analogy  solution,  relations  are 
relatively  discrete  components  of  semantic  memory.  This  characterization 
probably  is  correct  for  most  verbal  analogies,  but  there  are  cases  in  which 
it does not apply. An example was provided by Rumelhart and Abrahamson
(1973), who studied solution of verbal analogy problems in a single
semantic  domain,  the  names  of  animals. 

Analogies  composed  of  animal  names  have  two  properties  that  are 
different  from  most  verbal  analogies.  First,  they  depend  on  more  than  one 
relation,  and  the  relations  are  combined  somehow  in  solving  the  problem. 
Second,  the  relations  differ  in  degree,  rather  than  just  being  present  or 
absent. 

An example that illustrates multiple relations is the following:
rabbit:sheep::beaver:(tiger, donkey). Donkey seems the better answer,
perhaps because, while a relationship involving size is similar for
beaver:tiger and beaver:donkey, and both are similar to the size relation
for rabbit:sheep, there also is an additional difference for beaver:tiger:
tigers are ferocious while beavers are not. Thus the beaver:donkey pair,
which like the rabbit:sheep pair lacks a difference in ferocity, matches
the rabbit:sheep pair better. The graded nature of relations is illustrated by
rabbit:beaver::sheep:(donkey, elephant). Donkey seems the better answer.
The judgment seems to depend mainly on the sizes of the animals, and
beavers are larger than rabbits, but the difference seems not to be large
enough to make sheep:elephant seem appropriate.

To  represent  differences  of  graded  magnitudes  that  can  be  combined 
easily,  it  is  convenient  to  use  a  spatial  representation.  In  such  a 
representation,  the  dimensions  of  the  space  correspond  to  salient  ways  in 
which  items  differ  from  each  other.  Each  item  is  located  at  a  point  in  the 
space.  The  coordinates  of  the  point  correspond  to  the  values  that  the  item 
has  on  each  of  the  dimensions. 

A  spatial  representation  of  a  set  of  items  can  be  obtained  by 
presenting  pairs  of  the  items  to  subjects  and  asking  them  to  judge  how 
similar  the  members  of  each  pair  are  to  each  other.  These  judgments  of 
similarity  are  used  as  estimates  of  the  distances  between  pairs  of  items, 
and  items  are  located  in  the  space  so  that  the  distances  between  points  are 
as  close  as  possible  to  the  estimates  obtained  in  the  experiment.  In  the 
method  of  choosing  the  spatial  representation,  called  multidimensional 
scaling,  an  attempt  is  made  to  represent  the  items  in  one  dimension;  if 
that  is  unsuccessful  two  dimensions  are  used,  and  so  on  until  a  space  is 
found  with  the  points  located  so  that  interpoint  distances  agree 
satisfactorily  with  the  similarity  judgments  given  by  subjects. 

Henley  (1969)  obtained  judgments  of  similarity  for  pairs  of  animal 
names,  and  obtained  a  spatial  representation  with  three  dimensions:  size, 
ferocity,  and  a  third  dimension  that  probably  involves  a  mixture  of 
attributes,  including  similarity  to  humans.  These  results  were  used  by 
Rumelhart  and  Abrahamson  (1973)  in  their  study  of  analogy  problem  solving. 
The  relation  between  two  items  A  and  B  corresponds  to  the  vector  that 
connects  the  points  for  A  and  B  in  the  spatial  representation.  The  vector 
represents  the  combination  of  differences  in  the  three  dimensions  between 
the  two  items;  for  example,  the  vector  from  beaver  to  tiger  represents  a 
moderate  increase  in  ferocity,  a  large  increase  in  size,  and  very  little 
difference in "humanness." In Rumelhart and Abrahamson's model, to solve an
analogy, A:B::C:(D1, D2, D3, D4), the A->B vector is translated to C, and the
probability of choosing each of the Di alternatives is a function of its
distance from the ideal point defined by the end of the vector. In one
experiment, the model provided accurate predictions of the frequencies of
subjects' rankings of response alternatives in analogy problems. In
another experiment, fictitious animal names were assigned locations in the
spatial representation. These fictitious names were used in analogy problems for
which  subjects  received  feedback,  and  the  subjects  induced  features  of  the 
fictitious  animals,  responding  appropriately  to  new  analogies  involving 
their  names. 
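
The vector computation in this model can be illustrated with a short
Python sketch. The coordinates below are hypothetical values chosen for
illustration, not Henley's (1969) scaling solution, and the exponentially
decreasing choice function with parameter alpha is an assumption; the text
above says only that choice probability is a function of distance from the
ideal point.

```python
import math

# Hypothetical coordinates on (size, ferocity, humanness); illustrative only.
COORDS = {
    "rabbit": (1.0, 1.0, 2.0),
    "sheep": (4.0, 1.0, 2.0),
    "beaver": (2.0, 1.5, 2.0),
    "tiger": (5.0, 6.0, 2.0),
    "donkey": (5.0, 1.5, 2.0),
}


def ideal_point(a, b, c):
    """Translate the A->B vector so that it starts at C; return its end point."""
    return tuple(
        ci + (bi - ai) for ai, bi, ci in zip(COORDS[a], COORDS[b], COORDS[c])
    )


def choice_probabilities(a, b, c, alternatives, alpha=1.0):
    """Assign each alternative a probability that decreases with its
    Euclidean distance from the ideal point (assumed exponential form)."""
    ideal = ideal_point(a, b, c)
    weights = {
        d: math.exp(-alpha * math.dist(ideal, COORDS[d])) for d in alternatives
    }
    total = sum(weights.values())
    return {d: w / total for d, w in weights.items()}


probs = choice_probabilities("rabbit", "sheep", "beaver", ["tiger", "donkey"])
```

With these made-up coordinates the ideal point for rabbit:sheep::beaver:?
coincides with donkey, so donkey receives a much higher probability than
tiger, whose extra difference in ferocity moves it away from the ideal
point.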


Figure  12  here 


Relations  Based  on  Feature  Analysis.  In  a  geometric  analogy  problem, 
the  terms  are  diagrams  that  differ  in  various  ways.  An  example  is  in 
Figure  12.  The  best  answer  apparently  is  D2.  A  and  B  are  related  by 
deletion  of  the  dot  and  moving  the  rectangle  from  inside  the  triangle  to 
the  left  of  the  triangle.  C  and  D2  are  related  in  a  similar  way:  the  dot 
in  C  also  is  deleted,  and  the  Z  is  moved  from  inside  the  segment  of  the 
circle to the left of the segment.

As  Figure  12  illustrates,  the  relation  between  a  pair  of  diagrams  can 
have  several  parts,  corresponding  to  components  of  the  diagrams  that 
differ.  Some  of  the  differences  can  be  quantitative,  for  example,  the 
amount  of  rotation  of  a  component  or  the  amount  by  which  the  size  of  a 
component  is  increased  or  decreased.  In  the  domain  of  analogy  problems 
involving  animal  names,  these  characteristics  of  composite  and  quantitative 
relations  make  a  spatial  representation  of  items  a  reasonable  one.  A 
spatial  representation  is  not  economical  for  geometric  analogies,  because 
there  are  too  many  ways  in  which  diagrams  can  differ.  In  the  domain  of 
animal  names,  a  satisfactory  approximation  is  that  all  pairwise  relations 
can  be  characterized  by  differences  on  three  dimensions,  but  the  domain  of 
geometric  diagrams  does  not  have  such  a  simple  structure. 

In  geometric  analogies,  relations  are  found  by  examining  features  of 
the  diagrams,  rather  than  by  retrieving  information  from  memory,  as  with 
verbal  analogies.  A  model  of  solving  geometric  analogy  problems, 
therefore,  has  two  components:  one  component  that  analyzes  diagrams  and 
identifies  relations  between  them,  and  another  component  that  compares  the 
relation  of  A:B  with  relations  of  the  C:Di  alternatives  and  chooses  the 
best  match. 

Evans (1968) developed a model that solves geometric analogy problems.
The program is given descriptions of the diagrams that specify the
locations of straight lines, curved lines, and closed figures. From these
descriptions, relations among components are derived; for example, that
one figural component is inside another, or is above it in the diagram.
The model then compares its representations of the diagrams in pairs
and forms descriptions of relations between the members of the pairs.
These relations are in the form of transformations, that is, changes in
one diagram that would make it the same as the other diagram in the pair.
For example, a component in one diagram might be removed, a component
might be added, a component might be changed in size or rotated, or the
relative positions of two components might be changed, say, by moving one
from inside the other to above the other.

The  relation  between  A  and  B  is  then  compared  with  the  relations 
between  C  and  each  of  the  Di  alternatives.  This  comparison  involves 
matching  components  of  A  with  components  of  C  and  determining  which  of  the 
transformations  in  the  A:B  relation  also  occur  in  the  C:Di  transformation. 
The  Di  alternative  is  chosen  for  which  the  greatest  number  of 
transformations  can  be  made  to  correspond. 
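
The final comparison step can be sketched in Python as counting shared
transformations. This is a simplified illustration rather than Evans's
program: it assumes that the components of A and C have already been
matched and given common role names, and the transformation sets listed
for D1 and D3 are hypothetical, since only the A:B and C:D2 relations are
described in the text.

```python
# Transformations are written as tuples over role names shared by the
# matched components of A and C (for example, "inner figure" stands for
# the rectangle in A and the Z in C).
A_TO_B = {
    ("delete", "dot"),
    ("move", "inner figure", "inside", "left of"),
}

C_TO_D = {
    "D1": {("move", "inner figure", "inside", "left of")},  # hypothetical
    "D2": {
        ("delete", "dot"),
        ("move", "inner figure", "inside", "left of"),
    },
    "D3": {("delete", "dot")},  # hypothetical
}


def best_alternative(a_to_b, c_to_d):
    """Return the alternative sharing the most transformations with A:B,
    together with the score of every alternative."""
    scores = {name: len(a_to_b & changes) for name, changes in c_to_d.items()}
    return max(scores, key=scores.get), scores


choice, scores = best_alternative(A_TO_B, C_TO_D)
```

Under these assumptions D2 is chosen, because both transformations in the
A:B relation also occur in the C:D2 relation.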

Evans  (1968)  developed  his  model  as  a  project  in  artificial 
intelligence,  rather  than  as  a  simulation  of  human  problem  solving,  but 
even  so,  the  model  has  features  that  seem  plausible  as  psychological 
hypotheses.  One  such  feature  is  a  suggestion  that  problems  with  more 
complex  diagrams  or  relations  between  diagrams  should  be  more  difficult  for 
humans  to  solve.  In  the  model,  diagrams  are  more  complex  if  they  have  more 
components,  and  relations  are  more  complex  if  there  are  more 
transformations,  that  is,  if  there  are  more  changes  in  components  between 
diagrams  that  are  related.  These  factors  were  varied  in  an  experiment  by 
Mulholland,  Pellegrino  and  Glaser  (1980),  and  both  had  significant  effects. 
Problems that had diagrams with more components and problems with more
transformations  required  longer  times  for  solution. 

In  human  solution  of  geometric  analogy  problems,  we  should  expect  some 
of  the  same  characteristics  of  performance  that  have  been  observed  in 
solution  of  other  analogy  problems.  An  important  factor  in  verbal  analogy 
problems,  discussed  above,  is  the  processing  required  when  the  subject's 
representation  of  the  A:B  relation  and  the  C:Di  relations  are  not 
sufficient  to  provide  a  determinate  answer  and  further  processing  is 
needed.  Findings  by  Sternberg  (1977)  show  that  this  factor  is  important  in 
geometric  analogy  problems  as  well.  Sternberg  measured  the  time  to  solve 
problems  presented  after  part  of  the  problem  had  been  shown,  enabling  part 
of  the  processing  to  occur.  He  used  the  differences  between  conditions  as 
estimates  of  the  times  for  components  of  the  solution  process.  In 
comparing  subjects  with  differing  levels  of  general  reasoning  ability, 
Sternberg  found  a  large  difference  in  the  time  required  to  process  the  C:Di 
alternatives  in  geometric  analogy  problems,  with  much  of  the  difference 
attributed  to  a  process  of  comparing  alternatives  when  prior  processing  has 
not  provided  a  unique  solution. 

IV. C. 2.  Inductive  Problems  in  Mathematics  and  Science.  Cognitive 
analyses  have  been  developed  in  the  form  of  computer  programs  that  invent 
new  mathematical  concepts,  based  on  properties  of  examples,  and  that  induce 
formulas  and  structures  from  data  in  scientific  domains.  We  briefly 
discuss  three  models:  one  that  invents  new  mathematical  concepts,  one  that 
induces  formulas  from  sets  of  quantitative  data,  and  one  that  induces 
molecular  structure  from  data  of  mass  spectroscopy. 

Invention  of  Concepts  in  Mathematics.  A  program  called  AM  (Lenat, 
1982)  generates  examples  of  concepts  that  it  knows  and  develops  new 
concepts,  based  on  properties  of  the  examples.  The  main  domain  in  which  AM 
was  run  is  elementary  mathematics.  AM  was  given  initial  concepts  involving 
sets and developed a variety of concepts involving numbers. For example,
AM developed concepts of addition and multiplication, developed the concept
of  primes,  and  arrived  at  a  conjecture  that  every  number  is  the  product  of 
a  unique  combination  of  prime  numbers. 

It  is  useful  to  compare  AM's  task  to  the  standard  experimental  task  of 
concept induction, such as that studied by Bruner et al. (1956). In
standard concept induction, a set of examples is provided by the
experimenter, with some positive examples and some negative examples
determined by a rule, and the subject's task is to induce the rule.
Hypotheses  are  generated  by  the  subject  and  tested  with  information  about 
further  examples  until  the  correct  concept  has  been  found.  Each  hypothesis 
that  is  generated  is  itself  a  concept,  in  the  sense  that  it  provides  a  rule 
for  classifying  the  stimuli.  The  main  problem-solving  work  is  in  finding 
which  rule  is  correct. 

AM's task is less well defined, in two respects. First, the
examples are not provided by an experimenter, but rather are produced by
AM. Second, AM does not have a specified criterion of correctness for the
concepts that it generates. Instead, AM evaluates its concepts by some
criteria of importance, based in part on how easy it is to generate
examples of the concept.

AM's knowledge of concepts is organized with a set of facets,
including  some  that  are  standard  for  semantic  networks,  such  as 
generalizations,  specializations,  and  examples,  and  others  that  are 
especially  useful  in  the  domain  of  mathematics,  such  as  objects  that  are  in 
the  domain  or  range  of  a  function.  Facets  also  hold  procedural 
information,  such  as  methods  for  testing  whether  an  object  is  an  example  of 
the concept. AM's reasoning activity is organized as a set of tasks, each
involving a concept and one of its facets. Examples of tasks include
filling in examples of a concept or forming a generalization or a canonical
representation of a concept. Tasks that are proposed are placed on an
agenda,  and  choice  of  a  task  to  perform  is  based  on  an  evaluation  of  the 
reasons  for  the  task,  including  the  importance  of  concepts  for  which  the 
task  would  contribute  new  information.  Heuristics  that  contribute  to  the 
development  of  new  concepts  include  efforts  to  form  a  more  general  concept 
if  an  existing  concept  has  very  few  examples,  and  to  form  new 
representations  that  clarify  the  relations  between  concepts. 
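
The agenda mechanism can be illustrated with a small priority queue. The
sketch below is not AM's code: the class name, the reasons, and the rule
that a task's worth is the sum of numeric weights attached to its reasons
are all assumptions made for illustration, since the evaluation formula is
not given here.

```python
import heapq
import itertools


class Agenda:
    """Priority agenda of (concept, facet) tasks, each justified by weighted reasons."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker: earlier proposals come first

    def propose(self, concept, facet, reasons):
        # Assumption: a task's worth is the sum of the weights of its reasons.
        worth = sum(weight for _, weight in reasons)
        # heapq is a min-heap, so the worth is negated to pop the best task first.
        heapq.heappush(self._heap, (-worth, next(self._counter), concept, facet))

    def next_task(self):
        _, _, concept, facet = heapq.heappop(self._heap)
        return concept, facet


agenda = Agenda()
agenda.propose("primes", "examples",
               [("concept was just created", 3), ("concept rated important", 5)])
agenda.propose("sets", "examples", [("routine bookkeeping", 2)])
agenda.propose("multiplication", "generalizations",
               [("very few examples so far", 4)])
print(agenda.next_task())  # the task with the greatest total worth
```

Here the task of filling in examples of primes is selected first because
its reasons carry the greatest combined weight.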

We  note  that  AM  does  not  really  do  mathematics  in  the  usual  sense.  It 
has  no  concept  of  deductive  consequence,  and  thus  does  not  develop  a  body 
of  concepts  and  principles  with  a  formal  structure.  Even  so,  it  provides 
an  example  of  a  system  that  goes  well  beyond  the  knowledge  that  it  is  given 
initially,  moving  into  a  conceptual  domain  that  is  quite  distinct  from  that 
of its initial concepts.

Inducing Quantitative Regularities. A system called Bacon induces
formulas  from  numerical  data  (Langley,  1981;  Langley,  Bradshaw  &  Simon, 
1980).  The  data  are  values  of  some  variables  that  are  controlled  and  other 
variables  that  are  measured;  a  simple  example  is  in  Table  10.  The  goal  is 
to  find  a  formula  that  describes  the  relation  between  the  variables,  in 
this case distance and time. The two components of the problem space are
the subspace of stimuli (the set of data) and the space of structures (the
set of formulas with the variables that are included in the data).

A  simpler  approach  than  Bacon's  is  adequate  for  relatively  simple 
induction  problems.  This  simpler  approach  tries  to  fit  alternative 
formulas  that  are  known  in  advance.  For  example,  for  Table  10,  a  linear 
function  can  be  tried,  and  the  discrepancy  that  is  noted  shows  that  there 
is  positive  acceleration.  This  suggests  trying  a  quadratic  formula,  which 
fits  the  data.  Generate-and-test  methods  of  this  kind  have  been  analyzed 
by  Huesmann  and  Cheng  (1973)  and  by  Gerwin  (1974),  with  supporting 
experimental  data. 
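
A minimal sketch of this generate-and-test approach, applied to the data
of Table 10, is given below. The list of candidate forms, their ordering,
the least-squares estimate of the constant, and the tolerance are
assumptions of the sketch; they are not taken from the analyses cited.

```python
# Observations from Table 10: (time, distance).
DATA = [(1, 0.98), (2, 3.92), (3, 8.82), (4, 15.68), (5, 24.50)]

# Candidate forms known in advance, tried in order of simplicity (assumed ordering).
CANDIDATES = [
    ("d = k*t", lambda t: t),
    ("d = k*t^2", lambda t: t ** 2),
    ("d = k*t^3", lambda t: t ** 3),
]


def fit_constant(data, basis):
    """Least-squares estimate of k in d = k * basis(t)."""
    num = sum(d * basis(t) for t, d in data)
    den = sum(basis(t) ** 2 for t, _ in data)
    return num / den


def generate_and_test(data, candidates, tolerance=0.01):
    """Return (name, k) for the first candidate whose worst residual is within tolerance."""
    for name, basis in candidates:
        k = fit_constant(data, basis)
        worst = max(abs(d - k * basis(t)) for t, d in data)
        if worst <= tolerance:
            return name, k
    return None  # no candidate form fits the data


result = generate_and_test(DATA, CANDIDATES)
print(result)
```

The linear form is rejected because its residuals are large, and the
quadratic form is accepted with a constant close to 0.98.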

The task of inducing formulas can become unmanageable for a simple
generate-and-test method if there are several variables that can be related
in complex ways. For example, Bacon is able to induce Coulomb's Law,
relating electrical force to the charges on two bodies and the distance
between them: $f = q_1 q_2 / d^2$; and a formula for the electrical current
in a wire connected to a battery and a metal rod, depending on the
temperature differential of the bar, the internal resistance of the
battery, and the length and diameter of the wire: $I = T/(R + L/D^2)$. The
set of formulas that includes these is extremely large, and it seems
unlikely that simple equation fitting would be an effective method for
inducing formulas of this complexity.

Table 10
Data for a Simple Induction Problem

Time    Distance

1       0.98
2       3.92
3       8.82
4       15.68
5       24.50

Bacon's  search  method  uses  properties  of  the  data  to  guide  formation 
of hypotheses. We have discussed other induction systems with this
property,  including  the  concept-induction  strategy  of  focussing,  described 
by  Bruner  et  al.  (1956),  the  method  of  inducing  patterns  in  letter 
sequences  studied  by  Simon  and  Kotovsky  (1963),  and  AM's  heuristics  for 
generating  new  concepts  based  on  properties  of  examples.  Bacon's 
heuristics  involve  properties  of  quantitative  data  and  thus  differ,  as  one 
would  expect  them  to,  from  the  heuristics  of  other  systems  such  as  AM, 
where  the  data  involve  categories  of  examples  and  sets  of  defining 
features. Bacon's use of data has the further interesting feature of
creating new data in the process of evaluating its hypotheses. In
evaluating a hypothesis, Bacon calculates values of a new function of
available  data,  and  if  the  hypothesis  does  not  succeed,  those  values  become 
part  of  the  data  available  to  Bacon  for  further  problem  solving.  Thus,  an 
attempt  to  solve  the  problem  may  be  unsuccessful,  but  it  leaves  new  results 
that  may  be  instrumental  in  a  later  attempt  that  succeeds. 

Bacon's  basic  method  is  to  search  for  a  function  of  data  that  gives 
constant  values  across  experimental  conditions.  As  an  example,  the  formula 
for the data in Table 10 is $d = kt^2$, where k is a constant; the form in
which Bacon discovers the law is $d/t^2 = k$.

Bacon  uses  heuristic  rules  to  form  hypotheses,  consisting  of  functions 
of  variables  in  its  data  base,  that  might  give  constant  values.  For 
example, if two quantities increase or decrease together, Bacon forms their
ratio as a new quantity to be considered. If one variable decreases as
another increases, Bacon forms their product as a new quantity. These
heuristics, and another that forms linear functions of variables, enable
Bacon to induce relatively complex functions. (The first two are
sufficient for the problem in Table 10. First, note that t and d increase
together, and form the ratio $t/d$. This decreases with t, so form the
product of t and $t/d$, which is $t^2/d$. This quantity is constant across
the observations.)
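
The ratio and product heuristics can be sketched as follows for the
two-variable case. This is a deliberately reduced illustration, not
Bacon itself: it always combines the controlled variable with the most
recently formed term, it omits the linear-function heuristic, and the
function names and tolerance are assumptions.

```python
def is_constant(values, rel_tol=1e-6):
    """True if all values agree with their mean to within a relative tolerance."""
    mean = sum(values) / len(values)
    return all(abs(v - mean) <= rel_tol * abs(mean) for v in values)


def trend(values):
    """Return +1 if strictly increasing, -1 if strictly decreasing, else 0."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    if all(x > 0 for x in diffs):
        return 1
    if all(x < 0 for x in diffs):
        return -1
    return 0


def bacon_search(x_name, x, y_name, y, max_steps=5):
    """Combine x with the current term until a constant quantity is found."""
    term_name, term = y_name, y
    for _ in range(max_steps):
        if is_constant(term):
            return term_name, sum(term) / len(term)
        direction = trend(x) * trend(term)
        if direction > 0:    # the quantities increase together: form the ratio
            term_name = f"({x_name})/({term_name})"
            term = [a / b for a, b in zip(x, term)]
        elif direction < 0:  # one rises as the other falls: form the product
            term_name = f"({x_name})*({term_name})"
            term = [a * b for a, b in zip(x, term)]
        else:
            return None      # no monotonic trend; these two heuristics do not apply
    return None


t = [1, 2, 3, 4, 5]
d = [0.98, 3.92, 8.82, 15.68, 24.50]
print(bacon_search("t", t, "d", d))
```

For the data of Table 10 the search forms t/d, then the product of t and
t/d, and stops because that product has the same value in every
observation.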

Some  other  heuristic  methods  are  also  used,  including  definition  of 
"intrinsic  variables"  as  properties  of  objects  that  are  associated  with 
constant  values  of  quantities,  and  attempts  to  find  a  common  divisor  for 
values  of  intrinsic  variables  that  have  been  induced.  These  heuristics 
enable  induction  of  properties  such  as  the  resistances  of  different  wires 
from  measurements  of  current,  and  the  atomic  and  molecular  weights  of 
chemical elements from data about weights and volumes of elements and
compounds involved in chemical reactions.

We noted previously that induction problems can also be
understood as problems of design, especially when the structures that are
induced are expressed explicitly as formulas. This view is particularly
appropriate in the case of Bacon's induction of formulas. Consider the
task  as  construction  of  a  formula  using  symbols  for  the  variables  in  the 
problem.  Bacon's  heuristics  then  are  rules  for  forming  combinations  of  the 
symbols  that  may  satisfy  the  problem  criterion.  If  a  formula  does  not 
solve the problem, it may provide part of the formula that is needed.
Thus, the process of search with construction of partial solutions,
characteristic of design problems, provides an appropriate characterization
of Bacon's process of induction.

Bacon is not intended as a complete simulation of cognitive processes
in  scientific  research,  where  hypotheses  about  causal  mechanisms  often  play 
a  critical  role  in  decisions  to  measure  variables  or  to  examine 
quantitative  relationships.  Even  so,  it  provides  a  demonstration  that 
quite  simple  heuristics  are  sufficient  to  produce  quite  complex  inductive 
conclusions  from  quantitative  data,  and  it  is  reasonable  to  suppose  that 
these  heuristics  correspond  to  significant  components  of  complex  scientific 
reasoning. 

Inducing Molecular Structure. Another scientific task that has been
investigated is induction of the molecular structure of organic compounds.
A system called Dendral induces molecular structure from data in the form
of mass spectra (Lindsay, Buchanan & Feigenbaum, 1980). A mass spectrum is
a set of quantities of the fragments of various sizes that are produced
when molecules of a substance are bombarded by electrons.

Like  AM  and  Bacon,  Dendral  performs  induction  using  heuristic  search. 
An  important  difference  is  that  Dendral  uses  search  heuristics  that  are 
based  on  principles  that  are  specific  to  the  domain  of  organic  chemistry, 
whereas  AM's  methods  apply  in  any  domain  with  a  structure  of  categorical 
concepts,  and  Bacon's  methods  can  be  applied  to  any  quantitative  data. 

Dendral's method of induction has three main stages. First, the
chemical  formula  of  the  compound  is  inferred  from  features  of  the  mass 
spectrum.  Then  hypotheses  about  molecular  structures  are  generated  with 
constraints  based  on  knowledge  of  the  class  of  compounds  that  the  substance 
belongs  to.  Finally,  the  hypotheses  are  tested  by  comparing  their 
implications  with  the  quantitative  details  of  the  mass  spectrum,  and  the 
hypothesis  is  chosen  that  provides  the  best  agreement  with  the  data. 

The  data  used  to  infer  the  chemical  formula  are  the  peaks  in  the  mass 
spectrum.  The  largest  mass  represented  probably  is  the  mass  of  the 
molecular  ion,  or  may  be  smaller  than  the  molecular  ion  by  one  fragment. 
Differences  between  peaks  usually  correspond  to  the  masses  of  fragments 
that  are  broken  off  in  the  bombardment.  Dendral  uses  the  value  of  the 
largest  peak  and  the  interpeak  distances,  along  with  knowledge  of 
chemistry,  to  infer  one  or  more  chemical  formulas  that  are  consistent  with 
the  spectrum. 
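
The first stage can be illustrated, in a much reduced form, by enumerating
the compositions whose mass equals that of the largest peak. The sketch
below is not Dendral's procedure: it assumes compounds of carbon, hydrogen
and oxygen only, uses integer atomic masses, ignores the interpeak
distances, and applies a single simple valence constraint as a stand-in
for the chemical knowledge that the system brings to bear.

```python
# Nominal (integer) atomic masses; restricting the search to C, H and O is an
# assumption made to keep the illustration small.
MASS = {"C": 12, "H": 1, "O": 16}


def candidate_formulas(molecular_mass):
    """Enumerate (C, H, O) atom counts whose nominal mass equals molecular_mass."""
    results = []
    for c in range(molecular_mass // MASS["C"] + 1):
        remaining = molecular_mass - c * MASS["C"]
        for o in range(remaining // MASS["O"] + 1):
            h = remaining - o * MASS["O"]  # hydrogens make up the rest of the mass
            # Simple valence constraint: at most 2c + 2 hydrogens, and an even count.
            if h <= 2 * c + 2 and h % 2 == 0:
                results.append((c, h, o))
    return results


print(candidate_formulas(46))  # → [(1, 2, 2), (2, 6, 1)]
```

For a largest peak at mass 46 the constraint leaves two compositions, one
with one carbon, two hydrogens and two oxygens, and one with two carbons,
six hydrogens and one oxygen.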

Dendral's next task is to generate possible molecular structures, with
the  ions  in  the  formula  arranged  in  various  ways  that  are  consistent  with 
known  possible  arrangements.  There  are  many  millions  of  possibilities  for 
most  problems,  so  Dendral  formulates  constraints  based  on  knowledge  of  the 
class  of  compounds  that  the  sample  belongs  to.  With  the  constraints, 
Dendral  constructs  hypotheses  about  molecular  structure  with  a  method  that 
first  determines  the  maximum  number  of  rings  in  the  structure,  then 
constructs  the  possible  partitions  of  ions  into  rings  and  remaining 
components,  and  finally  constructs  the  possible  structures  for  each 
possible  partition. 

Finally,  Dendral  tests  its  many  hypotheses,  using  the  quantitative 
details  of  the  mass  spectra.  In  the  different  hypothesized  structures, 
different  components  are  separated  by  different  numbers  of  bonds; 
therefore,  there  are  differences  in  the  likelihoods  that  they  will  occur 
together in a fragment. Assuming that fragments are produced by breaking
one or two bonds at once, predictions are made about the relative amounts
of  material  to  be  found  at  each  peak  in  the  spectrum,  and  the  structure 
that  fits  the  data  best  is  chosen. 

Note that Dendral's task, like Bacon's, involves constructing an
explicit  formula  to  represent  the  structure  that  it  induces.  Thus  its 
method  can  also  be  considered  as  solving  a  problem  of  design,  where  the 
materials  for  the  construction  are  symbols  that  represent  the  atomic 
components  of  chemical  compounds,  and  the  chemical  knowledge  that  it  uses 
constrains  the  search  for  an  arrangement  of  those  materials  that  satisfies 
the  criterion  of  agreement  with  the  mass  spectrum. 

IV. D.  Diagnostic  Problem  Solving 

We  conclude  this  section  by  discussing  problem  solving  that  involves 
troubleshooting  in  electronics  and  diagnosis  in  medicine.  In  these  tasks, 
the  problem  solver  has  a  space  of  stimuli  consisting  of  one  or  more 
symptoms  and  further  information  that  can  be  obtained  by  performing  tests. 
The space of structures is a set of possible causes of the symptoms: faulty
components in the case of electrical circuits, or disease states in the
case of medicine.

In  addition  to  its  characteristics  of  inductive  problem  solving, 
diagnostic  problem  solving  also  has  components  of  operational  thinking, 
because  it  is  based  on  the  goal  of  curing  the  patient's  illness  or 
repairing  the  device.  Thus,  the  information  and  conclusions  in  the 
diagnosis  are  directed  toward  making  a  decision  about  a  remedial  treatment 
that  should  be  applied. 

IV. D. 1. Troubleshooting. The task in troubleshooting is to determine
which  of  the  many  components  of  an  electronic  system  is  operating 
incorrectly,  causing  the  system  to  function  improperly.  There  may  be  more 
than  one  fault,  but  it  simplifies  the  problem  greatly  to  assume  that  there 
is  a  single  fault  in  the  system. 

In  a  general  way,  troubleshooting  resembles  the  task  of  inducing 
categorical  concepts  when  the  subject  chooses  the  stimuli  for  which 
information  is  given.  In  concept  induction,  the  problem  solver  obtains 
information  by  asking  whether  a  specific  stimulus  is  a  positive  or  a 
negative  instance.  In  troubleshooting,  information  is  obtained  by  taking 
readings  of  voltage  or  current  at  specific  locations  in  the  circuit.  In 
both  cases,  there  are  many  possible  hypotheses  to  be  considered,  but  the 
set of possibilities can be specified: in concept induction, it is the
set of logical combinations of the stimulus attributes, and in
troubleshooting, it is the set of possible faults of components. These
similarities of the tasks are correlated with an important resemblance in
the effective methods of working on the problems. The focussing strategy
in concept induction uses information obtained about instances to eliminate
classes  of  hypotheses,  rather  than  considering  each  hypothesis  individually 
as  is  done  in  the  less  effective  scanning  strategy  (Bruner  et  al.,  1956). 
Similarly,  in  troubleshooting,  an  important  component  of  strategy  is  to 
conduct  tests  that  will  enable  elimination  of  sets  of  possible  faults  from 
consideration.  Use  of  this  strategy  is  enabled  by  general  knowledge  about 
electronic  components  as  well  as  by  knowledge  of  the  specific  circuit  in 
the problem, as we discuss below. This requirement of knowledge to support
the process of induction is analogous to the role played in concept
induction  by  knowledge  of  the  alternative  logical  forms  (conjunction, 
disjunction,  etc.)  and  the  truth-table  combinations  that  correspond  to  them 
(Dodd  et  al.,  1971),  although  the  knowledge  required  in  troubleshooting  is 
considerably  more  elaborate. 
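
The strategy of choosing tests that eliminate sets of possible faults can
be sketched as follows, under the single-fault assumption. The circuit, the
test names, and the rule of preferring the test that divides the remaining
candidates most evenly are hypothetical; they illustrate the idea and are
not drawn from any of the systems discussed.

```python
def best_test(candidates, tests):
    """Pick the test whose outcome splits the remaining candidate faults most evenly.

    tests maps a test name to the set of faults that would give an abnormal reading.
    """
    def worst_case(name):
        abnormal = len(candidates & tests[name])
        return max(abnormal, len(candidates) - abnormal)
    return min(sorted(tests), key=worst_case)


def update(candidates, tests, name, reading_abnormal):
    """Eliminate every fault that is inconsistent with the observed reading."""
    implicated = tests[name]
    return candidates & implicated if reading_abnormal else candidates - implicated


def diagnose(true_fault, faults, tests):
    """Single-fault diagnosis; returns the remaining candidates and the tests used."""
    candidates, used = set(faults), []
    remaining = dict(tests)
    while len(candidates) > 1 and remaining:
        name = best_test(candidates, remaining)
        candidates = update(candidates, remaining, name, true_fault in remaining[name])
        used.append(name)
        del remaining[name]
    return candidates, used


# Hypothetical chain of four stages; a probe after stage k reads abnormal
# if the fault lies in any of stages 1..k.
FAULTS = ["S1", "S2", "S3", "S4"]
TESTS = {
    "probe1": {"S1"},
    "probe2": {"S1", "S2"},
    "probe3": {"S1", "S2", "S3"},
}
print(diagnose("S3", FAULTS, TESTS))
```

With the fault in the third stage, the middle probe is taken first and
clears two stages at once; a second probe then isolates the fault.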

A  model  of  troubleshooting  is  included  in  a  system  called  Sophie  that 
provides  computer-based  instruction  for  trainees  in  electronics  maintenance 
(Brown,  Burton  &  deKleer,  1983).  The  troubleshooting  system  provides  a 
model  for  the  student  to  observe  in  learning  how  to  diagnose  faults  in  a 
circuit.  The  student  can  specify  a  fault  in  the  circuit,  and  Sophie  then 
can  solve  the  problem  of  diagnosing  the  fault,  performing  a  series  of  tests 
to  obtain  readings  of  current  or  voltage  at  various  points  in  the  circuit, 
forming  hypotheses  about  the  fault,  and  eventually  arriving  at  a  decision 
about  it.  Sophie  has  general  knowledge  about  electronics  and  an  explicit 
representation  of  strategy  that  enables  it  to  provide  explanations  to 
students  for  tests  that  it  is  performing,  regarding  both  principles  of 
electronics  and  the  strategic  purposes  of  its  activity.  Sophie's 
troubleshooting  knowledge  is  also  used  to  evaluate  the  problem-solving 
performance  of  students,  by  providing  a  series  of  problem-solving  steps 
that  can  be  compared  with  the  steps  taken  by  students. 

Sophie's knowledge for troubleshooting has four main components: two
components of electronics knowledge, a component of knowledge for making
specific inferences, and a component of strategic knowledge. One component
is general knowledge about electronics in the form of "experts" that have
information about characteristics of different kinds of electronic
components  such  as  resistors  and  diodes.  These  experts  can  use  data 
obtained  from  readings  to  calculate  values  for  other  variables,  assuming 
normal  functioning  of  components  of  the  circuit;  these  inferred  values 
then  can  be  compared  with  actual  readings  of  those  variables. 

A  second  component  of  Sophie's  knowledge  is  information  about  the 
specific  circuit  that  is  used  for  instruction.  The  circuit  is  represented 
hierarchically as a set of modules with submodules and components.
Possible functional states of each module and component are represented,
including  normal  functioning  and  possible  fault  states.  Experimental 
evidence  obtained  by  Egan  and  Schwartz  (1979)  is  consistent  with  a 
hypothesis  that  human  electronics  experts  represent  circuits  in  ways 
similar  to  Sophie's.  Egan  and  Schwartz  showed  that  experts  encode 
information  from  circuit  diagrams  rapidly,  similar  to  performance  by 
experts  in  other  domains  such  as  chess  (see  Section  III.B),  and  that 
functional  modules  made  up  of  components  that  are  spatially  contiguous  in 
the  diagram  play  an  important  role  in  the  performance. 

A  third  part  of  Sophie's  knowledge  involves  specific  actions  that 
occur  during  troubleshooting.  This  knowledge  is  in  the  form  of  rules  for 
making  inferences  about  the  states  of  modules  and  components  of  the 
circuit. Readings are used to eliminate hypotheses about faults by showing
that a module is functioning normally, and to propagate inferences in
the  hierarchical  representation;  for  example,  if  a  component  is  faulted, 
then all the modules that contain that component must also be faulted.

The fourth component of knowledge is Sophie's strategy, a
breadth-first  search  method  with  backtracking.  Sophie  considers  all  the 
possible  states  that  can  occur,  according  to  its  representation  of  the 
circuit,  and  eliminates  possible  fault  states  on  the  basis  of  readings  that 
are  consistent  with  normal  functioning.  It  assumes  normal  functioning  of 
components  until  there  is  a  reading  that  conflicts  with  that  assumption; 
however,  it  keeps  a  record  of  the  assumptions  used  in  its  inferences,  and 
if  information  contradicts  an  assumption  made  earlier,  inferences  based  on 
that  assumption  are  revised. 
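
Two of the inferences described above, elimination of fault hypotheses
when a module is shown to function normally and propagation of a fault
upward through the hierarchy, can be sketched as follows. The circuit
hierarchy and all names are hypothetical, and the sketch does not model
the recording and revision of assumptions.

```python
# Hypothetical circuit hierarchy: each module lists its direct parts.
PARTS = {
    "power supply": ["regulator", "rectifier"],
    "regulator": ["Q1", "R5"],
    "rectifier": ["D1", "D2"],
}


def descendants(unit):
    """Every module or component contained in a unit, at any depth."""
    found = set()
    for part in PARTS.get(unit, []):
        found.add(part)
        found |= descendants(part)
    return found


def eliminate_if_normal(suspects, module):
    """A reading showing that a module works normally clears everything inside it."""
    return suspects - descendants(module) - {module}


def faulted_units(component):
    """If a component is faulted, every module that contains it is faulted too."""
    return {component} | {m for m in PARTS if component in descendants(m)}


suspects = {"Q1", "R5", "D1", "D2"}
print(sorted(eliminate_if_normal(suspects, "rectifier")))  # → ['Q1', 'R5']
print(sorted(faulted_units("Q1")))  # → ['Q1', 'power supply', 'regulator']
```

A single normal reading for the rectifier removes both of its diodes from
consideration, which is the sense in which the hierarchy supports
elimination of sets of hypotheses.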

IV. D. 2.  Medical  Diagnosis.  In  medical  diagnosis,  as  in 
troubleshooting,  a  system  —  in  this  case,  a  human  body  —  is  functioning 
improperly,  and  the  inductive  task  is  to  infer  the  cause  of  the 
malfunction.  Also  as  in  troubleshooting,  the  purpose  of  the  diagnosis  is 
to  determine  a  treatment  that  can  remedy  the  malfunction,  and  the 
diagnostic  activity  is  conducted  in  a  way  that  provides  information 
relevant  to  choosing  a  treatment. 

Several  systems  have  been  developed  that  solve  diagnostic  problems  in 
various  domains  of  medicine,  including  diagnosis  of  infectious  agents  and 
prescription  of  antibiotics  (Shortliffe,  1976),  prescription  of  digitalis 
therapy  for  cardiac  patients  (Silverman,  1975),  and  diagnosing  and 
prescribing  treatment  for  varieties  of  glaucoma  (Weiss,  Kulikowski  &  Safir, 
1978). (For a review, see Ciesielski, Bennett & Cohen, 1977.) We discuss
one system, Caduceus, which performs general diagnosis. We also discuss
some  empirical  studies  of  diagnostic  problem  solving  by  physicians  with 
varying  amounts  of  training  and  experience. 

A Model of Knowledge for General Diagnosis. Knowledge used in general
medical  diagnosis  has  been  investigated  in  the  context  of  a  model  named 
Caduceus  (Miller,  Pople  &  Myers,  1982;  Pople,  1982).  The  knowledge  that 
Caduceus  has  for  diagnosing  diseases  is  similar  in  important  ways  to  the 
knowledge  used  by  Sophie  for  diagnosing  faults  in  electronic  circuits.  It 
is  hierarchical  in  form,  enabling  systematic  search  in  the  space  of 
hypotheses. Caduceus also has rules that infer hypotheses based on
symptoms  and  test  results  and  that  propagate  inferred  information  using  the 
hierarchical  structure  of  its  knowledge. 

Caduceus's  knowledge  about  diseases  is  of  two  kinds,  organized  in  two 
separate  but  related  graph  structures.  One  of  these,  called  a  nosological 
graph,  provides  a  taxonomy  of  diseases  based  on  the  organs  of  the  body  that 
are  involved  and  on  etiological  factors.  This  graph  provides  groupings  of 
diseases  based  on  their  manifestations.  The  other  knowledge  structure, 
called  a  causal  graph,  contains  information  about  disease  states  and 
processes.  The  causal  graph  contains  technical  concepts  of  pathology  that 
refer to states of disease, such as cardiogenic shock.

Caduceus has the goal of identifying one or more disease entities that
provide  a  complete  explanation  of  a  set  of  symptoms  and  findings  in  the 
case.  Subproblems  are  formulated  from  findings  that  are  not  yet  integrated 
in  an  explanatory  network;  these  constitute  diagnostic  tasks  that  are 
generated  by  the  system.  Identification  of  the  disease  depends  mainly  on 
the  nosological  graph;  this  hierarchical  structure  is  used  in  a  top-down 
search  to  narrow  the  possible  disease  entities.  The  information  about 
disease states and processes in the causal graph provides links between
hypothesized disease entities and the specific symptoms and test results
that  are  available.  Caduceus  concludes  its  diagnostic  analysis  when  an 
explanatory  network  has  been  developed  that  includes  all  the  available 
symptoms  and  findings. 
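
The termination criterion, that every available finding be accounted for
by the hypothesized diseases, can be illustrated with a simple covering
procedure. The greedy selection used here is an assumption of the sketch
and is not the method of Caduceus, which searches its nosological and
causal graphs; the disease and finding labels are placeholders.

```python
def explain(findings, explains):
    """Greedily pick disease hypotheses until every finding is accounted for.

    explains maps a disease label to the set of findings it can account for.
    Returns the chosen diseases and any findings left unexplained.
    """
    unexplained, chosen = set(findings), []
    while unexplained and explains:
        # Ties are broken alphabetically so that the result is reproducible.
        best = max(sorted(explains), key=lambda d: len(explains[d] & unexplained))
        if not explains[best] & unexplained:
            break  # no hypothesis accounts for what remains
        chosen.append(best)
        unexplained -= explains[best]
    return chosen, unexplained


FINDINGS = {"f1", "f2", "f3", "f4"}
EXPLAINS = {
    "disease A": {"f1", "f2"},
    "disease B": {"f2", "f3"},
    "disease C": {"f3", "f4"},
}
print(explain(FINDINGS, EXPLAINS))
```

Findings that remain unexplained after a hypothesis is chosen play the
role of the subproblems mentioned above: they define the diagnostic task
that is taken up next.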

Empirical Studies of Diagnostic Performance. An extensive study of
performance in diagnostic problems was conducted by Feltovich (1981; also
described in Johnson, Duran, Hassebrock, Moller, Prietula, Feltovich, &
Swanson, 1981). The results were consistent with the general properties of
the Caduceus model. They also provide information about characteristics of
knowledge  for  diagnosis  at  different  levels  of  experience  and  expertise. 
Feltovich  obtained  problem-solving  protocols  for  cases  in  pediatric 
cardiology  from  individuals  varying  in  experience  from  fourth-year  medical 
students  who  had  just  completed  a  six-week  course  in  pediatric  cardiology 
to two professors who had more than 20 years of experience in the
subspecialty.  Information  from  five  cases  was  presented  serially  and  the 
physicians  gave  their  hypotheses  and  other  thoughts  about  the  cases, 
attempting  to  arrive  at  a  correct  diagnosis. 

The  performance  of  experts  indicated  knowledge  that  differed  from  that 
of  novices  in  several  ways,  consistent  with  the  general  features  of  expert 
knowledge  discussed  in  Section  III.B.  The  major  difference  was  that 
experts  had  more  integrated  knowledge  about  diseases.  Experts  also  had 
more  detailed  knowledge  of  variations  of  disease  states  and  more  precise 
knowledge  of  relationships  between  diseases  and  symptoms.  This  was 
indicated  in  the  performance  of  one  advanced  expert  by  his  mentioning 
groups  of  hypotheses  that  were  supported  by  the  findings  presented  first, 
with  later  information  used  to  narrow  the  range  of  possibilities.  The 
other  advanced  expert  used  a  more  depth-first  strategy,  proposing  a  likely 
hypothesis  based  on  preliminary  findings,  but  modifying  the  hypothesis  in  a 
flexible  way  when  later  evidence  provided  counterindications.  The 
knowledge  of  novices  was  primarily  in  the  form  of  a  few  specific  disease 
forms  used  in  textbook  cases.  Novices  responded  to  early  evidence  by 
proposing  reasonable  hypotheses,  but  were  less  likely  to  recognize  the 
significance  of  later  evidence  and  change  their  hypotheses  when  this  was 
indicated.  The  sets  of  hypotheses  mentioned  by  novices  during  problem 
solving  were  significantly  smaller  than  those  of  the  experts. 

Similar  conclusions  regarding  expert  knowledge  for  diagnosis  were 
supported  in  a  study  of  expert  and  novice  radiologists  (Lesgold,  Feltovich, 
Glaser  &  Wang,  1981).  Lesgold  et  al.  found  that  in  reading  x-ray  films, 
experts  generated  representations  in  a  three-dimensional  system,  using 
salient  features  to  generate  initial  hypotheses  that  were  refined  or 
modified  on  the  basis  of  more  detailed  features.  Knowledge  for  recognizing 
features  associated  with  abnormalities  appeared  to  be  well  integrated  with 
general knowledge of anatomy. The integration of experts' knowledge was
evidenced  by  their  ability  to  use  features  noted  early  as  constraints  on 
later  interpretations  (cf.  Stefik,  1981).  Interpretations  of  novices  (in 
this  case,  first-year  residents  in  radiology)  depended  more  on  finding  an 
explanation  for  a  few  features,  with  a  tendency  for  other  details  to  be 
assimilated  to  the  initial  hypothesis  rather  than  used  to  generate 
alternative hypotheses or modifications.


It  is  important  to  note  the  close  similarity  of  conclusions  from  these 
studies  of  expert  diagnosticians  in  medicine  and  the  studies  of  expert 
performance  in  other  problem-solving  domains,  especially  physics  and  chess. 
Present  findings  indicate  that  a  major  source  of  expert  performance  is  the 
expert's  ability  to  represent  problems  successfully,  and  that  this  results 
from  the  expert's  having  a  well  integrated  structure  of  knowledge  in  which 
patterns  of  features  in  the  problem  are  associated  with  concepts  at  varying 
levels  of  generality,  enabling  efficient  search  for  hypotheses  about  the 
salient  features  of  the  problem  that  cannot  be  observed  directly  as  well  as 
methods  and  operations  to  be  used  in  solving  the  problem.

V.  Evaluation  of  Deductive  Arguments

The  relation  between  human  reasoning  and  formal  logic  has  long  been  a 
subject  of  discussion  and  debate,  and  for  some  decades,  a  subject  for 
experiment  as  well.  It  is  generally  agreed  that  human  "logical"  reasoning 
does  not  always  conform  to  the  laws  of  formal  logic.  Formal  logic  is  a 
normative  theory  of  how  people  ought  to  reason,  rather  than  a  description
of  how  they  do  reason.  It  is  important,  then,  to  develop  a  descriptive 
theory  of  human  reasoning  to  compare  and  contrast  with  the  logic  norms. 

Experiments  aimed  at  developing  a  theory  of  human  reasoning  have 
mostly  set  tasks  of  judging  the  correctness  or  incorrectness  of  formal 
syllogisms.  These  tasks  require  application  of  the  rules  of  deductive 
argument  which  are  special  in  some  ways,  and  correct  performance  depends  on 
the  subject's  knowledge  and  use  of  the  technical  rules  of  formal  deductive 
inference.  However,  the  processes  used  in  these  tasks  do  not  differ  in  any 
fundamental  way  from  those  involved  in  problem  solving  in  other  domains. 
Psychological  analyses  provide  no  basis  for  a  belief  in  deductive  reasoning 
as  a  category  of  thinking  processes  that  differ  from  other  thinking 
processes,  other  than  in  the  special  set  of  operators  that  are  permitted  in 
rigorous  deductive  arguments.  As  Woodworth  put  the  matter,  "Induction  and 
deduction...  are  distinguished  as  problems  rather  than  processes" 
(Woodworth,  1938,  p.  801). 

We  discuss  studies  of  two  tasks.  First,  we  discuss  propositional  and 
categorical  syllogisms,  which  present  arguments  in  the  sentential  and 
predicate  calculus.  Subjects  frequently  make  errors  in  evaluating  these 
syllogisms,  and  research  has  attempted  to  explain  why  the  reasoning  process 
differs  from  correct  logical  inference.  Second,  we  discuss  linear 
syllogisms,  which  present  arguments  that  depend  on  transitivity  of  order 
relations.  Subjects  make  the  transitive  inferences  in  these  tasks  without 
difficulty,  and  psychological  analyses  have  focused  on  the  cognitive 
representation  of  information  in  the  syllogisms. 

V.A.  Propositional  and  Categorical  Syllogisms 

Subjects  in  experiments  on  propositional  or  categorical  syllogisms  are 
asked  to  judge  the  validity  of  arguments  such  as  the  following  (invalid) 
propositional  syllogism: 

If  I  push  the  left-hand  button,  the  letter  T  appears. 

I  did  not  push  the  left-hand  button. 

Therefore,  the  letter  T  did  not  appear. 

The  major  premise  states  what  will  happen  if  the  button  is  pushed.  It  says 
nothing  about  what  will  or  will  not  happen  if  the  button  is  not  pushed.
Hence  the  conclusion  does  not  follow  from  the  premises.  Yet  in  a  typical 
experiment  (Rips  &  Marcus,  1977)  a  fifth  of  the  subjects  accepted  this  as  a 
valid  syllogism. 
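
The invalidity of this argument can be checked mechanically with a truth table. The following is a minimal sketch, not a procedure taken from the studies cited; the names `is_valid`, `implies`, `p`, and `t` are illustrative assumptions, and "if" is read as the material conditional of formal logic.

```python
from itertools import product

def is_valid(premises, conclusion, n_vars):
    """Valid means: no truth assignment makes all premises true and the conclusion false."""
    for values in product([True, False], repeat=n_vars):
        if all(prem(*values) for prem in premises) and not conclusion(*values):
            return False  # found a counterexample row
    return True

def implies(a, b):
    # Material conditional of formal logic (an assumption about how "if" is read).
    return (not a) or b

# p: "I pushed the left-hand button"; t: "the letter T appeared"
denying_antecedent = is_valid(
    premises=[lambda p, t: implies(p, t), lambda p, t: not p],
    conclusion=lambda p, t: not t,
    n_vars=2,
)
modus_ponens = is_valid(
    premises=[lambda p, t: implies(p, t), lambda p, t: p],
    conclusion=lambda p, t: t,
    n_vars=2,
)
print(denying_antecedent, modus_ponens)  # False True
```

The counterexample row is the one in which the button is not pushed and T appears anyway: both premises are true there and the conclusion is false.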

Categorical  syllogisms  in  the  predicate  calculus  involve  statements 
containing  the  terms  some,  all,  and  no.  An  example  of  a  (valid)
categorical  syllogism  is 

Some  jewels  are  diamonds. 

All  diamonds  are  valuable.

Therefore,  some  jewels  are  valuable.

Again,  human  subjects  make  frequent  mistakes  in  judging  whether  certain 
kinds  of  categorical  syllogisms  are  valid.  For  example,  subjects  are  very 
likely  to  mistakenly  judge  that  the  following  argument  is  a  valid  syllogism 
(Johnson-Laird  &  Steedman,  1978):

Some  As  are  Bs.

Some  Bs  are  Cs.

Therefore,  some  As  are  Cs.
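
One way to see why the jewel argument is valid while this one is not is to search for a counterexample among small set-theoretic models. The sketch below is illustrative only: it assumes the modern predicate-calculus reading in which "all" carries no existential import, and the helper names (`some`, `all_`, `no`, `counterexample`) are hypothetical. Because these quantifiers depend only on which combinations of category membership are inhabited, it suffices to enumerate the non-empty sets of the eight possible kinds of individual.

```python
from itertools import product

# Each kind of individual is a triple (in_A, in_B, in_C).
KINDS = list(product([False, True], repeat=3))
A, B, C = 0, 1, 2

def some(x, y):
    return lambda model: any(k[x] and k[y] for k in model)

def all_(x, y):
    # No existential import: vacuously true when nothing is an x.
    return lambda model: all(k[y] for k in model if k[x])

def no(x, y):
    return lambda model: not any(k[x] and k[y] for k in model)

def models():
    """Every non-empty set of inhabited kinds (255 candidate models)."""
    for mask in range(1, 2 ** len(KINDS)):
        yield [k for i, k in enumerate(KINDS) if (mask >> i) & 1]

def counterexample(premises, conclusion):
    """Return a model with true premises and false conclusion, or None if valid."""
    for model in models():
        if all(p(model) for p in premises) and not conclusion(model):
            return model
    return None

# Some jewels (A) are diamonds (B); all diamonds are valuable (C).
valid_case = counterexample([some(A, B), all_(B, C)], some(A, C))
# Some As are Bs; some Bs are Cs; therefore some As are Cs.
invalid_case = counterexample([some(A, B), some(B, C)], some(A, C))
print(valid_case is None, invalid_case is not None)  # True True
```

Any model returned for the second argument contains an A that is a B and a different B that is a C, with no individual that is both an A and a C.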

In  experiments  on  syllogistic  reasoning,  the  type  of  syllogism 
presented  is  most  commonly  taken  as  the  independent  variable,  and  the 
number  of  subjects  making  errors  on  syllogisms  of  different  kinds  is
measured.  By  comparing  the  error  rates  for  different  kinds  of  syllogisms,
the  experimenter  seeks  to  formulate  and  test  hypotheses  about  the  cognitive
processes  that  subjects  use  to  make  such  syllogistic  judgments.

For  example,  subjects  will  often  accept,  "No  As  are  Bs  and  no  Bs  are 
Cs,  therefore  no  As  are  Cs;"  while  they  will  almost  always  reject,  "No  As
are  Bs  and  no  Bs  are  Cs,  therefore  all  As  are  Cs."  Yet  both  syllogisms  are
equally  invalid.  Such  errors  of  reasoning  have  sometimes  been  attributed 
to  an  "atmosphere  effect."  In  the  example  above,  "no"  appears  in  the 
premises,  therefore  "no"  is  more  acceptable  than  "all"  in  the  conclusion 
(Woodworth  &  Sells,  1935).  Alternatively,  some  investigators  have  claimed 
that  the  reason  for  these  errors  is  that  the  quantifiers  and  connectives,
all,  some,  no,  if...then,  and,  or,  do  not  have  the  same  meaning  in  natural
language  as  they  do  in  formal  logic  (Braine,  1978).  According  to  this
hypothesis,  since  the  experimenter  judges  the  correctness  of  answers  by
their  conformity  to  the  rules  of  formal  logic  while  the  subjects  are  using
the  natural  language  meanings,  errors  will  be  made  when  the  two  kinds  of
meaning  diverge. 

Errors  and  latencies  in  reasoning  tasks  depend  not  only  on  the  form  of 
the  syllogism,  but  also  on  whether  or  not  it  has  meaningful  content 
(Wilkins,  1928).  Thus,  subjects  may  respond  differently  to  the  syllogism,
"If  some  As  are  Bs  and  some  Bs  are  Cs,  then  some  As  are  Cs,"  and  the
syllogism  "If  some  birds  have  blue  eyes  and  some  blue-eyed  creatures  are 
human,  then  some  birds  are  human." 

In  general,  subjects'  error  rates  are  lower  when  syllogisms  have 
meaningful  content,  but  there  is  an  important  class  of  exceptions.
Subjects  often  reject  valid  syllogisms  when  the  conclusions  are  contrary  to 
facts  known  to  them.  "If  all  horses  have  four  feet  and  all  fish  are 
horses,  then  all  fish  have  four  feet,"  may  be  rejected  by  subjects  who  know 
that  fish  are  footless.  The  rate  of  rejection  rises  when  subjects  react 
emotionally  to  the  conclusion.  "If  drug  addiction  is  a  disease  and 
diseases  should  not  be  punished,  then  drug  addiction  should  not  be
punished,"  will  more  likely  be  rejected  by  subjects  who  support  strong
measures  against  drug  usage  than  those  who  do  not  (Janis  &  Frick,  1943;
Lefford,  1946).  Conversely,  subjects  often  accept  invalid  syllogisms  when
the  conclusions  are  consistent  with  their  knowledge  about  the  world  or
their  preferences.

All  of  these  findings  must  be  stated  as  "tendencies,"  since  many
subjects  who  make  errors  on  some  syllogisms  of  a  certain  form  do  not  make
such  errors  consistently.  Moreover,  there  are  large  individual  differences 
among  subjects.  For  example,  subjects  who  have  had  training  in  formal 
logic  generally  make  fewer  errors  —  not  surprisingly  —  than  subjects  who 
have  not  had  such  training. 

While  human  syllogistic  reasoning  conforms  to  some  broad 
generalizations  of  the  sorts  that  have  been  mentioned  already,  the  findings 
derived  from  experiments  are  complex  and  confusing.  In  the  past  several 
years,  a  few  investigators  have  sought  to  cut  through  the  confusion  by 
creating  models  of  the  inference  process  or  some  components  of  it.  The
attempt  to  create  such  models  has  revealed  features  of  the  reasoning  task 
that  had  not  been  entirely  obvious. 

First,  any  one  of  a  wide  range  of  strategies  might  be  used  by  subjects 
to  solve  the  problems,  and  there  is  no  reason  to  believe  that  all  subjects 
use  the  same  strategies.  Subjects  who  reason  by  vague  verbal  analogies 
could  succumb  to  the  atmosphere  effect,  while  other  subjects  who  create
semantic  images  of  the  propositions  and  reason  by  operating  on  those  images 
might  make  quite  different  errors.  (Certain  syllogisms  might  require  the 
creation  of  images  more  complex  than  a  subject  could  handle  in  memory.) 
Subjects'  knowledge  of  logical  inference  can  be  embedded  in  formal  axioms 
or  in  inference  rules,  with  different  consequences  for  the  likelihood  of 
error.  The  axioms  that  define  connectives  or  the  inference  rules  might 
conform  to  some  natural  logic  that  deviates  from  the  formal  logic  of  the 
textbooks.

Several  quite  successful  recent  modelling  efforts  have  used  the  idea 
that  evaluation  of  syllogisms  is  a  form  of  problem  solving  similar  to  that 
discussed  in  Section  II.A.  Using  a  set  of  inferential  operators,  the
subject  attempts  to  confirm  the  conclusion  working  from  the  premises,  and
accepts  the  conclusion  if  this  problem-solving  effort  succeeds.  The
process  typically  used  by  subjects  differs  from  the  task  of  finding
explicit  proofs  in  that  the  inferential  operators  are  not  expressed
overtly,  and  of  course  need  not  correspond  completely  to  the  rules  of 
formal  logic. 

Models  of  evaluating  propositional  syllogisms  have  been  formulated  by 
Osherson  (1975),  Braine  (1978),  and  Rips  (1983).  These  models  are  based  on 
the  concept  of  natural  deduction,  discussed  by  Gentzen  (1935/1969).  A 
system  of  natural  deduction  is  a  form  of  production  system.  Rules  for 
making  inferences  specify  conditions  in  the  form  of  patterns  of 
propositions,  and  when  a  pattern  is  matched  in  premises,  the  inference  is 
made.  The  models  account  for  performance  by  postulating  sets  of  inference
rules  assumed  to  be  used  implicitly  by  subjects.  Rips  also  formulated  a 
specific  process  of  applying  the  rules  and  forming  representations  of  the 
derivation.  An  interesting  feature  of  Rips's  formulation  is  the  inclusion 
of  suppositions  that  provide  a  backward-chaining  component  in  the  search 
process.  A  syllogism  is  judged  valid  if  the  system  can  generate  a 
derivation  of  the  conclusion  from  its  inference  rules. 
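
The production-system character of such models can be sketched as a small forward-chaining loop. This is an illustration under stated assumptions, not the rule set of Osherson, Braine, or Rips: the three rules, the tuple encoding of propositions, and the function names are hypothetical, and the sketch omits the suppositions (backward chaining) that Rips's formulation includes. The rule `invited_inference` is deliberately non-logical; it stands in for a "natural" reading of if...then that licenses the invalid button argument.

```python
def modus_ponens(facts):
    # From ("if", p, q) and p, infer q.
    return {f[2] for f in facts
            if isinstance(f, tuple) and f[0] == "if" and f[1] in facts}

def and_elimination(facts):
    # From ("and", p, q), infer p and q.
    out = set()
    for f in facts:
        if isinstance(f, tuple) and f[0] == "and":
            out.update(f[1:])
    return out

def invited_inference(facts):
    # Hypothetical non-logical rule: from ("if", p, q) and ("not", p), infer ("not", q).
    return {("not", f[2]) for f in facts
            if isinstance(f, tuple) and f[0] == "if" and ("not", f[1]) in facts}

def derives(premises, conclusion, rules):
    """Judge the argument valid if the rules can derive the conclusion."""
    facts = set(premises)
    while conclusion not in facts:
        new = set().union(*(rule(facts) for rule in rules)) - facts
        if not new:
            return False  # no rule matches anything new; give up
        facts |= new
    return True

button_rule = ("if", "push", "T")
logical_rules = [modus_ponens, and_elimination]

accepts_valid = derives([button_rule, "push"], "T", logical_rules)
rejects_invalid = not derives([button_rule, ("not", "push")], ("not", "T"), logical_rules)
accepts_invalid = derives([button_rule, ("not", "push")], ("not", "T"),
                          logical_rules + [invited_inference])
print(accepts_valid, rejects_invalid, accepts_invalid)  # True True True
```

Under this sketch, whether an invalid syllogism is accepted depends entirely on which rules are assumed to be in the subject's repertoire, which is the sense in which such models account for error patterns.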


The  idea  that  sentential  syllogisms  are  evaluated  by  natural  deduction 
provides  an  interpretation  of  many  of  the  kinds  of  errors  that  occur  in 
syllogistic  reasoning.  Because  it  is  an  informal  reasoning  system,  it  is 
not  surprising  that  it  is  susceptible  to  influence  by  general  knowledge  and
affect.  One  would  expect  performance  to  be  improved  if  subjects  were 
taught  a  more  explicit  procedure  for  verifying  the  applicability  of 
inference  rules  in  evaluating  syllogisms,  and  this  result  was  obtained  in 
the  domain  of  geometry  proofs  in  a  study  by  Greeno  and  Magone  (described  in 
Greeno,  1983). 

Models  of  reasoning  for  categorical  syllogisms  have  been  formulated  by
Guyote  and  Sternberg  (1981)  and  by  Johnson-Laird  and  Steedman  (1978).
These  models  use  the  idea  that  the  information  in  premises  is  represented
in  the  form  of  examples;  for  example,  "Some  jewels  are  diamonds"  might  be
represented  as  a  symbol  for  a  jewel  that  is  a  diamond  and  another  symbol
for  a  jewel  that  is  not  a  diamond.  A  representation  is  formed  based  on  the
premises,  and  is  used  to  evaluate  the  conclusion.  Errors  occur  because  the
representations  are  incomplete;  the  examples  generated  by  the  system  often
fail  to  exhaust  the  possibilities,  leading  to  incorrect  conclusions.
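
A small sketch can show how an incomplete set of examples leads to an error on "Some As are Bs; some Bs are Cs; therefore some As are Cs." The two sets of example individuals below are hypothetical illustrations and are not the representations actually constructed by the cited models. In the first, the B that is a C happens to be the same individual as the A that is a B; the alternative keeps them distinct.

```python
# Each example individual is the set of categories it belongs to.
first_examples = [{"A", "B", "C"}, {"A"}, {"B"}]
alternative_examples = [{"A", "B"}, {"A"}, {"B", "C"}, {"B"}]

def some(x, y, examples):
    return any(x in ind and y in ind for ind in examples)

def premises_hold(examples):
    return some("A", "B", examples) and some("B", "C", examples)

# A reasoner who stops after the first set of examples accepts the conclusion.
accepts_from_first = premises_hold(first_examples) and some("A", "C", first_examples)
# The alternative satisfies the same premises but falsifies the conclusion.
refuted_by_alternative = (premises_hold(alternative_examples)
                          and not some("A", "C", alternative_examples))
print(accepts_from_first, refuted_by_alternative)  # True True
```

The error arises from failing to generate the alternative, not from faulty evaluation of the examples that were generated.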

V.B.  Linear  Syllogisms 

In  a  linear  syllogism,  premises  specify  ordered  relations  between 
pairs  of  objects,  and  questions  are  asked  about  pairs  for  which  the  order
was  not  specified.  An  example  (Egan  &  Grimes-Farrow,  1982)  is:

Circle  is  darker  than  square.

Square  is  darker  than  triangle. 

Is  triangle  darker  than  circle? 

(An  alternative  is  to  ask  "Which  is  darkest?"  or  "Which  is  lightest?") 
Problems  are  presented  with  relations  expressed  differently,  such  as
"Triangle  is  lighter  than  square,"  or  "Triangle  is  not  as  dark  as  square,"
with  the  premise  information  given  in  different  orders,  and  with  different
questions.

To  answer  the  question,  the  information  in  the  premises  must  be
encoded  in  some  representation  that  enables  the  answer  to  be  derived.
Three  hypotheses  about  representation  have  been  considered.

According  to  a  spatial  hypothesis  (DeSoto,  London  &  Handel,  1965;
Huttenlocher,  1968),  information  in  the  premises  is  integrated  into  an
ordered  list,  possibly  using  an  image  in  which  symbols  are  spatially 
aligned.  A  representation  for  the  example  would  be  an  ordering  with  circle 
first,  square  second,  and  triangle  third,  perhaps  imagined  in  a  vertical 
line  with  the  circle  at  the  top.  Then  a  question  such  as  "Is  circle  darker
than  triangle?"  would  be  answered  by  comparing  the  positions  of  the  circle 
and  the  triangle  in  the  ordered  representation. 

A  second  hypothesis  (Clark,  1969)  is  that  the  representation  consists 
of  propositions  in  which  individual  objects  are  associated  with  values  of 
attributes.  For  the  example,  circle  would  be  associated  with  a  large 
degree  of  darkness,  square  with  a  medium  degree,  and  triangle  with  a  small 
degree.  A  question  would  be  answered  by  retrieving  representations  of  the 
objects  in  the  question  and  comparing  the  properties  associated  with  them.

The  third  hypothesis  is  that  representations  of  binary  relations  are
stored  in  memory.  This  hypothesis  assumes  the  simplest  process  of 
representation,  since  information  in  memory  corresponds  directly  to  the 
information  in  the  premises.  To  answer  a  question,  however,  a  sequence  of 
propositions  has  to  be  retrieved;  for  example  to  answer  "Is  circle  darker 
than  triangle,"  both  "Circle  darker  than  square"  and  "Square  darker  than 
triangle"  have  to  be  retrieved. 
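
The three hypothesized representations can be contrasted in a short sketch for the circle, square, triangle example. Everything here is an illustrative assumption rather than a model from the literature: the function names, the numeric darkness values, and the simplification that each later premise relates one new object to an object already placed next to it in the order.

```python
premises = [("circle", "square"), ("square", "triangle")]  # (darker, lighter)

# 1. Spatial hypothesis: integrate premises into one ordered list, darkest first.
def build_order(pairs):
    order = list(pairs[0])
    for darker, lighter in pairs[1:]:
        if darker in order and lighter not in order:
            order.insert(order.index(darker) + 1, lighter)
        elif lighter in order and darker not in order:
            order.insert(order.index(lighter), darker)
    return order

def darker_spatial(order, x, y):
    return order.index(x) < order.index(y)  # compare positions

# 2. Attribute hypothesis: each object is tagged with a degree of darkness.
darkness = {"circle": 3, "square": 2, "triangle": 1}  # large, medium, small

def darker_attribute(values, x, y):
    return values[x] > values[y]  # compare retrieved properties

# 3. Binary-relation hypothesis: store the premises and chain them at question time.
def darker_binary(pairs, x, y):
    """Return (answer, number of chaining steps through stored relations)."""
    steps, frontier, seen = 0, {x}, {x}
    while frontier:
        steps += 1
        frontier = {light for dark, light in pairs if dark in frontier} - seen
        if y in frontier:
            return True, steps
        seen |= frontier
    return False, steps

order = build_order(premises)
print(order)                                        # ['circle', 'square', 'triangle']
print(darker_spatial(order, "circle", "triangle"))  # True
print(darker_attribute(darkness, "circle", "triangle"))  # True
print(darker_binary(premises, "circle", "triangle"))     # (True, 2)
```

All three give the same answers; they differ in where the work is done. The first two do the integration at encoding time, while the third defers it to the question, where two stored relations must be chained.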

The  hypothesis  that  binary  relations  are  represented  is  ruled  out  by 
data  obtained  by  Potts  (1974),  who  had  subjects  study  paragraphs  containing 
series  with  six  terms  and  asked  questions  involving  pairs  that  varied  in 
their  separation;  with  the  ordering  A>B>C>D>E>F,  C>D?  has  separation  0, 
B>D?  has  separation  1,  B>E?  has  separation  2,  and  so  on.  If  binary
relations  are  in  memory,  questions  with  greater  separation  should  take
longer,  since  answers  to  these  questions  require  more  inferential  steps.
The  finding  was  the  opposite:  items  with  greater  separation  required  less 
time  to  respond.  This  finding  has  also  been  obtained  with  comparisons 
involving  general  knowledge,  such  as  the  relative  sizes  of  animals  (Banks, 
1977). 
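
The prediction that the data contradict can be made concrete. The sketch below assumes, purely for illustration, that only the five adjacent relations of the series A>B>C>D>E>F are stored and that each inferential step retrieves one of them; the function names are hypothetical, and the first term of a question is assumed to precede the second in the series.

```python
terms = "ABCDEF"
# Only adjacent pairs are stored under the binary-relation hypothesis.
stored = {(terms[i], terms[i + 1]) for i in range(len(terms) - 1)}

def separation(x, y):
    """Number of terms lying between x and y in the series."""
    return terms.index(y) - terms.index(x) - 1

def retrievals_needed(x, y):
    """Stored relations that must be chained to get from x down to y."""
    count, current = 0, x
    while current != y:
        current = next(b for a, b in stored if a == current)
        count += 1
    return count

for x, y in [("C", "D"), ("B", "D"), ("B", "E")]:
    print(x, y, separation(x, y), retrievals_needed(x, y))
# C D 0 1
# B D 1 2
# B E 2 3
```

Under this hypothesis the number of retrievals, and hence the predicted latency, grows with separation, whereas the observed latencies decreased.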


The  question  whether  premises  are  represented  by  an  integrated  spatial 
array  or  by  propositions  associating  properties  with  individual  objects  has 
been  harder  to  resolve.  Huttenlocher  (1968)  provided  an  argument  for  the
spatial  hypothesis,  including  the  finding  that  latency  is  shorter  when  the
second  premise  has  the  third  individual  as  the  subject  of  the  sentence
(e.g.,  A>B,  C<B  rather  than  A>B,  B>C).  The  interpretation  is  that  the
subject  imagines  placement  of  the  new  object  in  a  spatial  array,  and  this 
is  easier  if  the  object  is  mentioned  as  the  sentence  subject  than  the 
sentence  object.  Clark  (1969)  argued  for  a  propositional  representation, 
presenting  evidence  that  performance  is  influenced  by  linguistic  factors 
such  as  the  congruence  of  questions  with  premises  (e.g.,  "A>B,  which  is
greater?"  is  easier  than  "B<A,  which  is  greater?").

Sternberg  (1980)  formulated  models  that  specify  stages  of  processing
based  on  assumptions  of  a  spatial  or  a  propositional  representation  of 
premises.  He  also  formulated  a  model  that  combines  the  assumptions,  with 
linguistic  factors  influencing  an  initial  encoding  of  premises  and 
relations  among  propositions  influencing  a  process  of  converting  the 
information  into  an  integrated  spatial  array.  The  combined 
linguistic-spatial  model  provided  a  more  accurate  account  of  latency  data 
than  either  of  the  simpler  models  based  on  linguistic  or  spatial  factors. 

Several  investigators  have  provided  evidence  that  linear  syllogisms
are  not  solved  in  a  single  way  by  all  subjects;  rather,  different 
representations  are  used  by  different  individuals  (Mayer,  1979;  Sternberg 
&  Weill,  1980).  Egan  and  Grimes-Farrow's  (1982)  evidence  was  particularly
direct.  They  used  retrospective  protocols  obtained  after  solutions  of 
individual  problems.  The  protocols  indicated  that  some  subjects  used 
spatial  representations  consistently,  and  other  subjects  sometimes  formed
representations  with  individual  objects  in  the  problem  associated  with 
differing  quantitative  values  of  attributes.  The  protocol  evidence  was 
substantiated  by  analyses  showing  different  influences  on  subjects' 
performance  depending  on  the  representations  they  reported  using.  The 
order  in  which  objects  were  mentioned  was  significant  for  subjects  who  used 
spatial  representations,  and  the  linguistic  factor  of  consistency  of  the
relational  term  used  was  significant  for  subjects  who  sometimes  used
individual  object  propositions.

V.C.  Conclusions 

Until  quite  recently  there  has  been  little  relation  between  the 
research  on  reasoning  and  research  on  problem  solving  of  the  sorts 
discussed  in  the  previous  sections  of  this  chapter.  Sometimes  this 
separation  has  been  justified  on  the  grounds  that  syllogistic  reasoning  is
"deductive"  while  problem  solving  is  "inductive."  We  have  seen  that  this
distinction  does  not  hold  water.  While  a  syllogism  is  a  deductive 
structure,  finding  valid  steps  or  testing  whether  proposed  steps  are  valid 
is  not  a  deductive  process.  Indeed,  the  major  process  in  evaluation  of  a 
propositional  or  categorical  syllogism  is  an  attempt  to  find  a  proof  of  the 
conclusion,  the  process  that  we  discuss  in  Section  II.A  as  the  prototypical
example  of  goal-based  problem  solving.  For  linear  syllogism  problems,  the 
major  process  is  an  example  of  inductive  problem  solving  as  that  concept  is 
used  in  Section  IV,  in  which  the  subject  forms  an  integrated  representation 
of  the  premises  using  the  structure  of  an  ordered  list  induced  from  the 
order  relations  that  the  premises  state. 

From  the  fact  that  all  reasoning  involves  problem  solving,  however,  it 
does  not  follow  that  there  is  no  need  for  special  theory  in  the  domain  of 
syllogistic  reasoning.  To  understand  human  reasoning,  we  must  understand 
the  meanings  that  people  attach  to  words  and  the  rules  of  inference  that 
constitute  their  systems  of  "natural  logic"  as  well  as  the  structure  of  the
control  system  that  guides  their  problem  solving  search.  Recent
investigations  have  progressed  significantly  on  these  questions. 


VI.  Conclusions 


The  literature  reviewed  in  this  chapter  includes  analyses  of  problem 
solving  in  a  few  dozen  tasks.  Important  general  characteristics  have 
emerged  in  these  analyses.  One  way  to  express  these  characteristics  is  to 
consider  the  task  of  analyzing  problem  solving  in  a  new  domain.  Analyses 
that  have  been  provided  give  quite  strong  guidance  about  the  kinds  of 
processes  and  knowledge  structures  that  one  should  look  for  in  an 
investigation  of  problem  solving. 

First,  it  is  important  to  investigate  the  subjects'  knowledge  and 
processes  for  representing  the  problem.  If  the  subjects  do  not  have 
special  training  in  the  problem  domain,  they  must  construct  a  problem  space 
that  includes  representations  of  the  problem  materials,  the  goal, 
operators,  and  constraints.  If  subjects  have  special  training  or 
experience  in  the  domain,  their  prior  knowledge  includes  general 
characteristics  of  the  problem  space,  and  their  representations  of 
individual  problems  are  based  on  this  general  knowledge.  Experts  are 
cognizant  of  general  methods  that  can  be  used  for  solving  problems,  and 
their  representations  of  problems  include  use  of  problem  information 
relevant  to  the  choice  of  a  solution  method. 

A  second  major  task  is  to  characterize  the  problem  representations 
that  subjects  form  in  their  understanding  of  the  problem.  In  relatively 
unfamiliar  domains,  problem  solving  is  primarily  a  process  of  search,  and 
the  problem  representation  determines  the  space  of  possibilities  in  which 
the  search  will  occur.  Some  basic  features  of  the  problem  space  depend  on 
the  problem  itself,  of  course.  A  problem  may  present  constraints  mainly  on 
the  operators  that  are  permitted  in  trying  to  achieve  a  goal,  or  on  the 
arrangement  of  materials  that  is  acceptable  as  a  solution,  or  may  present 
materials  and  require  induction  of  a  pattern  or  rule.  These  alternatives 
lead  to  differences  in  the  problem  space:  a  space  of  possible  sequences  of 
actions,  a  space  of  possible  solution  arrangements,  a  space  of  possible 
structures,  or  some  combination  of  these. 

The  problem  space  constructed  by  an  individual  subject  also  is 
determined  by  the  method  of  search  that  the  subject  uses,  features  of  the 
problem  that  are  used,  and  general  knowledge  that  is  applied.  In  a  problem 
of  transforming  a  situation  by  a  sequence  of  actions,  subjects  typically 
use  some  form  of  means-ends  analysis.  They  may  distinguish  between 
features  of  the  situation  that  are  more  or  less  essential  for  the  solution, 
and  organize  their  search  by  a  process  of  planning  that  focuses  on  the  more 
essential  features.  Searching  in  a  space  of  possible  solution  arrangements 
typically  involves  generating  partial  solutions  on  a  trial  basis,  and  is 
influenced  by  the  subjects'  knowledge  of  constraints  that  can  be  used  to 
limit  the  candidate  arrangements  that  are  considered.  Solution  of 
induction  problems  is  similarly  influenced  by  the  subjects'  knowledge  of
general  constraints  on  possible  solutions,  which  may  be  used  in  generating 
and  testing  hypotheses  or  in  a  process  of  synthesizing  or  abstracting 
structures  from  the  features  of  individual  objects  that  are  provided. 

In  problem  solving  for  which  subjects  have  special  training  or 
experience  the  problem  space  of  operators  and  constraints  is  provided  by 
the  subjects'  existing  knowledge.  Knowledge  of  experts  is  highly 
organized,  and  includes  solution  methods  and  concepts  for  representing
problems  at  varying  degrees  of  generality  and  abstraction.  For  relatively
simple  problems,  experts'  knowledge  often  provides  a  basis  for  immediate 
recognition  of  solution  methods  as  well  as  of  detailed  features  relevant  to 
the  solution.  Their  knowledge  of  relations  among  methods  and  operators  and 
of  constraints  in  the  domain  enables  problem-solving  performance  to  occur 
in  a  highly  organized  planful  manner. 

While  the  study  of  problem  solving  and  reasoning  has  progressed 
rapidly  and  achieved  a  substantial  level  of  knowledge  and  theory,  several
significant  questions  remain  largely  unanswered.  We  will  mention  four  of 
these. 


First,  while  performance  of  experts  on  relatively  simple  problems  is 
beginning  to  be  understood,  little  is  known  about  their  performance  on 
problems  that  are  difficult  and  deep.  It  is  possible  that  on  problems  for 
which  an  expert's  knowledge  does  not  provide  a  ready  method  of  solution, 
the  expert  resorts  to  "weak  methods"  of  search  and  analysis  fundamentally 
similar  to  those  used  by  novices.  It  also  is  possible  that  powerful 
processes  of  reasoning  in  a  domain  are  acquired  by  experts,  and  that  these 
are  used  in  solving  problems  for  which  specific  solution  methods  have  not
been  worked  out  and  stored  in  memory. 

A  second  question,  closely  related  to  the  first,  involves  the  general 
nature  of  problem  solving  in  its  more  powerful  and  productive  forms.  We 
have  referred  to  discussions  of  productive  thinking  by  Duncker  (1935/1945) 
and  Wertheimer  (1945/1959)  and  have  noted  progress  that  has  been  made  on 
some  of  the  issues  that  they  raised.  They  also  raised  a  critical  issue 
that  has  not  been  addressed  strongly  in  recent  discussions.  This  is  the 
process  of  constructing  more  powerful  representations  of  problems  by 
analysis  of  problem  components.  The  initial  representation  of  a  problem 
frequently  does  not  include  important  relationships  that  are  required  for  a 
meaningful  solution,  but  the  problem  solver  is  able  to  construct  a 
reformulation  that  includes  its  important  structural  features. 

A  third  question  for  which  there  are  promising  preliminary  results  but 
much  more  to  be  done  is  the  question  of  learning.  Analyses  of  acquisition 
require  understanding  of  the  skills  and  knowledge  that  are  acquired,  and  the
significant  accomplishments  in  characterizing  skill  and  knowledge  in 
problem  solving  provide  a  promising  basis  for  investigation  of  learning. 
Recent  proposals  regarding  acquisition  of  cognitive  skill  such  as  those  of 
Anderson  (1982),  Anzai  and  Simon  (1979),  Neches  (1981)  and  Neves  (1981) 
provide  significant  steps  in  the  analysis  of  learning  processes. 

A  fourth  question  involves  the  theoretical  power  of  general  principles 
in  the  analysis  of  problem  solving  and  reasoning.  The  analyses  that  are 
reviewed  in  this  chapter  provide  detailed  hypotheses  about  performance  in 
specific  tasks,  and  are  strongly  testable  at  the  level  of  their  assumptions 
about  specific  processes.  The  assumptions  made  at  a  more  general  level  are 
more  heuristic.  They  involve  concepts  and  principles  that  provide 
significant  guidance  in  constructing  hypotheses  about  specific  cognitive 
structures  and  processes,  but  they  rarely  constrain  those  hypotheses  in 
wholly  specifiable  ways.  The  question  whether  complex  processes  of  problem
solving  and  reasoning  are  constrained  by  significant  underlying  formal 
principles  is  an  open  question.  Some  investigators  (Keil,  1981;  VanLehn,
Brown  &  Greeno,  in  press)  have  urged  that  research  should  attempt  to
discover  general  principles  with  deductive  power  that  would  significantly 
constrain  characteristics  of  process  models.  Others  (Newell  &  Simon,  1976)
have  noted  that  there  are  good  reasons  for  expecting  that  complex  cognition
is  constrained  by  relatively  weak  structural  principles,  of  the  kind  that 
are  characteristic  of  present  theoretical  analyses. 

A  review  of  any  significant  body  of  scientific  research  can  be  closed 
with  the  remark  that  much  has  been  accomplished,  and  much  more  remains  to
be  done.  This  seems  particularly  apt  for  the  psychology  of  problem  solving
and  reasoning.  The  progress  that  has  been  made  in  the  1960s  and  1970s  in
this  domain  has  been  substantial,  and  concepts  and  methods  are  now
available  that  will  enable  future  investigations  to  address  issues  of 
further  significance.